eggers


Published on March 11, 2008

Author: Toni

Source: authorstream.com

Content

The Informative Role of WordNet in Open-Domain Question Answering
Marius Paşca and Sanda M. Harabagiu (NAACL 2001)
Presented by Shauna Eggers, CS 620, February 17, 2004

Introduction
Information Extraction: not just for keywords anymore!
  Massive document collections (databases, webpages) require more sophisticated search techniques than keyword matching
  Need a way to focus and narrow the search → improve precision
One solution: Open-Domain Q/A
  Find answers to natural language questions from large document collections
  Examples:
    “What city is the capital of the United Kingdom?”
    “Who is the first private citizen to fly in space?”
  Text Retrieval Conferences (TREC) evaluate entered systems; show that this sort of task can be performed with “satisfactory accuracy” (Voorhees, 2000)

Q/A: Previous Approach
Captures the semantics of the question by recognizing
  the expected answer type (i.e., its semantic category)
  the relationship between the answer type and the question concepts/keywords
The Q/A process:
  Question processing: extract concepts/keywords from the question
  Passage retrieval: identify passages of text relevant to the query
  Answer extraction: extract answer words from the passage
Relies on standard IR and IE techniques
  Proximity-based features
    Answer often occurs in text near the question keywords
  Named-entity recognizers
    Categorize proper names into semantic types (persons, locations, organizations, etc.)
    Map semantic types to question types (“How long”, “Who”, “What company”)

Problems
NE assumes all answers are named entities
  Oversimplifies the generative power of language!
  What about: “What kind of flowers did Van Gogh paint?”
Does not account well for morphological, lexical, and semantic alternations
  Question terms may not exactly match answer terms; connections between alternations of Q and A terms are often not documented in a flat dictionary
  Example: “When was Berlin’s Brandenburger Tor erected?” → no guarantee to match built
  Recall suffers

WordNet to the rescue!
WordNet can be used to inform all three steps of the Q/A process
  1. Answer-type recognition (Answer Type Taxonomy)
  2. Passage retrieval (“specificity” constraints)
  3. Answer extraction (recognition of keyword alternations)
Using WN’s lexico-semantic info: examples
  “What kind of flowers did Van Gogh paint?”
    Answer-type recognition: need to know (a) the answer is a kind of flower, and (b) the sense of the word flower
    WordNet encodes 470 hyponyms of flower sense #1, flowers as plants
    Nouns from retrieved passages can be searched against these hyponyms
  “When was Berlin’s Brandenburger Tor erected?”
    Semantic alternation: erect is a hyponym of sense #1 of build

Interactions between WN and Q/A
[Diagram. Labels: Question, Documents, Index, Question Processing, Document Processing, Answer Processing, Passage Retrieval, Answer Extraction, Answer(s), WordNet, Expected Answer Type, Keyword Alternations]

WN in Answer-type Recognition
Answer Type Taxonomy
  A taxonomy of answer types that incorporates WN information
  Acts as an “ontological resource” that can be searched to identify a semantic category (representing the answer type)
  Used to associate found semantic categories with a named entity extractor
    So, still using an NE, but not bound to proper nouns; have found a way to map NEs to more general semantic categories
  Developed on principles conceived for the Q/A environment (rather than as general ontological principles)
    Principle 1: Different parts of speech specialize the same answer type
    Principle 2: Selected word senses are considered
    Principle 3: Completeness of the top hierarchy
    Principle 4: Conceptual average of answer types
    Principle 5: Correlating the Answer Type Taxonomy with NEs
    Principle 6: Mining WordNet for additional knowledge

Answer Type Taxonomy (example)
[Figure not preserved in this transcript]

WN in Passage Retrieval
Identify relevant passages from text
  Extract keywords from the question, and
  Pass them to the retrieval module
“Specificity”: filtering question concepts/keywords
  Focuses search, improves performance and precision
  Question keywords can be omitted from the search if they are too general
  Specificity is calculated by counting the hyponyms of a given keyword in WordNet
    Count ignores proper names and same-headed concepts
    Keyword is thrown out if the count is above a given threshold (currently 10)

WN in Answer Extraction
If keywords alone cannot find an acceptable answer, look for alternations in WordNet!

Evaluation
Paşca/Harabagiu approach measured against TREC-8 and TREC-9 test collections
WN contributions to Answer Type Recognition
  Count the number of questions for which acceptable answers were found; 3GB text collection, 893 questions

Evaluation (2)
WN contributions to Passage Retrieval
  Impact of keyword alternations
  Impact of specificity knowledge

Conclusions
Massive lexico-semantic information must be incorporated into the Q/A process
Using such information encoded in WN improved system precision by 147% (qualitative analysis)
Visions for the future:
  Extend WN so that online resources like encyclopedias can link to WN concepts
    Answer questions like: “Which classic rock group first performed live in Albuquerque?”
  Further improve Q/A precision with WN extension projects
    E.g., “finding keyword morphological alternations could benefit from derivational morphology, a project extension of WordNet” (Harabagiu et al., 1999)
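
The specificity filter described under "WN in Passage Retrieval" can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: TOY_HYPONYMS is a hand-made stand-in for WordNet's hyponym links (its entries and counts are invented for the example), and the function names, the capital-letter test for proper names, and the last-word test for same-headed concepts are all assumptions. Only the rule itself (count hyponyms, skip proper names and same-headed concepts, drop the keyword above a threshold of 10) comes from the slides.

```python
# Toy stand-in for WordNet hyponym links: concept -> direct hyponyms.
# Entries are illustrative only and do not reflect real WordNet counts.
TOY_HYPONYMS = {
    "flower": ["sunflower", "iris", "wild flower", "orchid"],
    "orchid": ["vanilla orchid"],
    "structure": ["building", "bridge", "tower", "monument", "gate"],
    "building": ["house", "temple", "theater", "library", "hotel", "school"],
    "bridge": ["drawbridge", "footbridge"],
    "tower": ["Eiffel Tower", "bell tower"],
    "gate": ["Brandenburg Gate"],
}


def all_hyponyms(concept, graph):
    """Collect every direct and indirect hyponym of `concept`."""
    found = []
    seen = set()
    stack = list(graph.get(concept, []))
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        found.append(node)
        stack.extend(graph.get(node, []))
    return found


def specificity_count(keyword, graph):
    """Count hyponyms, skipping proper names and same-headed concepts."""
    count = 0
    for hyponym in all_hyponyms(keyword, graph):
        # Assumption: a leading capital letter marks a proper name.
        is_proper_name = hyponym[0].isupper()
        # Assumption: same-headed means the last word equals the keyword,
        # e.g. "wild flower" for the keyword "flower".
        is_same_headed = hyponym.split()[-1] == keyword
        if not (is_proper_name or is_same_headed):
            count += 1
    return count


def filter_keywords(keywords, graph, threshold=10):
    """Keep only keywords whose hyponym count does not exceed the threshold."""
    return [k for k in keywords if specificity_count(k, graph) <= threshold]


if __name__ == "__main__":
    print(filter_keywords(["structure", "flower", "tor"], TOY_HYPONYMS))
```

In this toy graph "structure" has more than 10 countable hyponyms and is dropped as too general, while "flower" and the unknown word "tor" are kept. A real system would query WordNet itself and would likely need a sense-aware count.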
