Initial Named Entity Efforts
Activity: Begin Named Entity Collection and Processing
Objective
The Begin Named Entity Collection and Processing activity is designed to: 1) gain experience in discovering and characterizing notable entities (named entities); and 2) characterize enough named entities to provide a compelling proof-of-concept. This activity complements concept identification and is thus the second contributing part of the underlying scones methodology.
An emphasis of this activity is finding efficient means to identify and harvest structured entity data. As such, it draws on critical thinking, Web page and data scrubbing, and entity data formatting and import.
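As an illustration of the scrubbing step, the following is a minimal sketch in Python, assuming entity names arrive as short HTML fragments copied from Web pages. The function names (scrub_entity_name, scrub_entity_names) and the cleanup rules are hypothetical and not part of any project tooling; real sources will likely need their own patterns.

```python
import html
import re


def scrub_entity_name(raw: str) -> str:
    """Clean one harvested entity name (hypothetical helper)."""
    text = html.unescape(raw)                  # decode entities such as &nbsp; or &amp;
    text = re.sub(r"<[^>]+>", " ", text)       # drop leftover HTML tags
    text = re.sub(r"\[[^\]]*\]", " ", text)    # drop bracketed markers such as [1]
    text = re.sub(r"\s+", " ", text).strip()   # collapse runs of whitespace
    return text


def scrub_entity_names(raw_names):
    """Clean a batch of names, dropping blanks and case-insensitive duplicates."""
    seen = set()
    cleaned = []
    for raw in raw_names:
        name = scrub_entity_name(raw)
        key = name.casefold()
        if name and key not in seen:
            seen.add(key)
            cleaned.append(name)   # the first spelling encountered is kept
    return cleaned
```

The first occurrence of each name is kept so that the output order follows the source order, which makes spot-checking against the original page easier.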
Major Deliverables
- Multiple entity lists for use by the ANNIE information extraction system
- Instance records with some attributes for the entity types selected for this phase
- irON-compliant data files of those entity records and attributes sufficient for import into the system
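The file-based deliverables above can be sketched as follows. This is a rough illustration, not the project's actual tooling: the helper names are hypothetical, the gazetteer output follows the plain one-entry-per-line list form with a lists.def index line of the shape file:majorType:minorType, and the commON layout (an "&&recordList" marker row followed by "&"-prefixed attribute columns) is an assumption that should be checked against the irON Specification before use.

```python
import csv
import io


def gazetteer_list(names):
    """Return the text of a gazetteer list file: one entity name per line."""
    return "".join(f"{name}\n" for name in names)


def lists_def_line(list_file, major_type, minor_type=""):
    """Return one index line of the form file:majorType[:minorType]."""
    line = f"{list_file}:{major_type}"
    if minor_type:
        line += f":{minor_type}"
    return line + "\n"


def common_records(records, attributes):
    """Return CSV text for instance records in an assumed commON layout.

    records    -- list of dicts, each with an "id" key plus attribute values
    attributes -- attribute names to emit as columns, in order
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["&&recordList"])                        # section marker (assumed)
    writer.writerow(["&id"] + [f"&{a}" for a in attributes])  # attribute header row
    for rec in records:
        # Missing attributes are written as empty cells.
        writer.writerow([rec["id"]] + [rec.get(a, "") for a in attributes])
    return buf.getvalue()
```

A call such as common_records([{"id": "acme", "prefLabel": "Acme Corp"}], ["prefLabel"]) yields three lines: the marker row, the header row, and one record row. Returning strings rather than writing files keeps the sketch easy to test; writing the results to disk is left to the caller.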
Tasks
List as sub-heads the key tasks:
Task 1: Identify Notable Entity Types
| Phase: | Phase 2 |
| Activity: | Initial Named Entity Efforts |
| Sort Order: | 1 |
| Page Status: | Unchecked |
| Include Task?: | Used |
| Task Status: | Active |
| Short Name: | Begin NE |
| URL: | |
| Description: | Provide a brief description of the Task. |
| Objective: | Provide brief statement of Objective. |
| Inputs: | Please provide: |
| Outputs: | Please provide: |
| Staffing: | |
| Total Hrs: | |
| Duration: | |
| Start Time: | |
| Cost: | |
Place whatever "free text" you wish here; it will appear at the bottom of the task listing.
Task 2: Identify Sources for Named Entities
| Phase: | Phase 2 |
| Activity: | Initial Named Entity Efforts |
| Sort Order: | 2 |
| Page Status: | Unchecked |
| Include Task?: | Used |
| Task Status: | Active, Incomplete |
| Short Name: | NE Sources |
| URL: | |
| Description: | Provide a brief description of the Task. |
| Objective: | Provide brief statement of Objective. |
| Inputs: | Please provide: |
| Outputs: | Please provide: |
| Staffing: | |
| Total Hrs: | |
| Duration: | |
| Start Time: | |
| Cost: | |
Place whatever "free text" you wish here; it will appear at the bottom of the task listing.
Task 3: Harvest and Process Named Entity Records
| Phase: | Phase 2 |
| Activity: | Initial Named Entity Efforts |
| Sort Order: | 3 |
| Page Status: | Unchecked |
| Include Task?: | Used |
| Task Status: | Active |
| Short Name: | Process NE Records |
| URL: | |
| Description: | Provide a brief description of the Task. |
| Objective: | Provide brief statement of Objective. |
| Inputs: | Please provide: |
| Outputs: | Please provide: |
| Staffing: | |
| Total Hrs: | |
| Duration: | |
| Start Time: | |
| Cost: | |
Place whatever "free text" you wish here; it will appear at the bottom of the task listing.
Task 4: Create Lists and Gazetteers
| Phase: | Phase 2 |
| Activity: | Initial Named Entity Efforts |
| Sort Order: | 4 |
| Page Status: | Unchecked |
| Include Task?: | Used |
| Task Status: | Active, Incomplete |
| Short Name: | Lists |
| URL: | |
| Description: | Provide a brief description of the Task. |
| Objective: | Provide brief statement of Objective. |
| Inputs: | Please provide: |
| Outputs: | Please provide: |
| Staffing: | |
| Total Hrs: | |
| Duration: | |
| Start Time: | |
| Cost: | |
Place whatever "free text" you wish here; it will appear at the bottom of the task listing.
Task 5: Prepare and Import commON Entity Records
| Phase: | Phase 2 |
| Activity: | Initial Named Entity Efforts |
| Sort Order: | 5 |
| Page Status: | Unchecked |
| Include Task?: | Used |
| Task Status: | Active |
| Short Name: | commON Records |
| URL: | |
| Description: | Provide a brief description of the Task. |
| Objective: | Provide brief statement of Objective. |
| Inputs: | Please provide: |
| Outputs: | Please provide: |
| Staffing: | |
| Total Hrs: | |
| Duration: | |
| Start Time: | |
| Cost: | |
Place whatever "free text" you wish here; it will appear at the bottom of the task listing.
Core Supporting Assets
List core or key Supporting Assets with explanations or expansions as appropriate.
- ANNIE
- A commON Case Study Using Sweet Tools
- GATE
- Guidelines for Identifying Notable Entity Types
- irON Specification
- regex Guidance
- scones
Yellow Flags
Areas to look out for include:
- One area of concern is spending a disproportionate share of time on this activity relative to other structured data activities (e.g., ontologies, overall vocabulary and framework, indicators, data collection)
- List Yellow Flag #2 (with any links)
- etc.
Key Resource Requirements
Key resources required for this activity include:
- List Key Resource #1 (with any links)
- List Key Resource #2 (with any links)
- etc.
Potential Changes to this Activity
Some potential changes to this activity in future versions could include:
- Improved workflows from later phases may remove or streamline some of the intermediate tasks
- Better and more comprehensive access to electronic files or databases may remove or streamline some of the intermediate tasks
- Replacing the GATE component that underlies scones (say, by a transition to UIMA) may significantly alter the workflow and tasks for this activity
- etc.