chalmers

Published on September 14, 2007

Author: Alien

Source: authorstream.com

Content

Detecting Coreference: Processing discourse reference
Christer Johansson, UiB

Those little words
Typical anaphora:
Pronouns: The cat bit the dog because it was angry.
Definite nouns: I saw a cat and a dog. The cat was chasing the dog.

Those little words: Pronouns
We often think of pronouns as some kind of place holder, or a variable.
Pronouns have some internal structure: gender, number, case, ...

Those little words: Nouns
Do not confuse words with what they refer to. 'Pigs can fly.' is a grammatical sentence even if it is not true. It is also meaningless if we do not know what 'pig' is, or the relations between all three words.
Words like nouns are in a sense also variables, although they are more restricted. Compare 'cat' with 'it'.

More Anaphora
Less typical anaphora:
Predication: John drives a taxi, and Joe studies math. Which one would you like to meet, the taxi driver or the math student?
Verb anaphora: Ann sings in the shower. The hollering lasted half an hour.

Coreference
'Coreference occurs when the same person, place, event, or concept is referenced more than once in a single document.' (Amit Bagga)

Extension of Coreference
Cross-document coreference '... occurs when the same person, place, event, or concept is referenced more than once in multiple sources' (Amit Bagga).
This is an essential Information Retrieval problem: are two documents about the same things?

Extension of Coreference: Images
Amit Bagga: Significant TV broadcasts are repeated across and within stations. Combining text and image recognition can aid detection of coreferent events.
(Slide images: broadcast frames labelled CBS 2829, CBS 3873, NBC 3885, NBC 5061.)

Applications

Information Extraction and Retrieval
Q/A systems:
Q: Who was king of Norway in 1985?
A: #He was king of Norway in 1985. (# marks an unhelpful answer)
Resolving references may find good keywords: themes are often referred to, so reference counts tell us more than word-form counts.

A simplistic example
The lion is the king[+1] of the jungle. She[+2] hunts mostly at night. The females[+3] live in groups. The male[+4] is much larger, but _[+5] lives alone.
Word form: 'lion' is 1 of 26 words.
Reference: 'lion' is 6 of 26 words.
The significance of 'lion' increases.
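A minimal sketch of this counting difference in Python. The text is the slide's lion example; the hand-written chain stands in for an assumed coreference resolver, so nothing here comes from an actual system.

```python
# Toy illustration: term frequency by surface form vs. by coreference chain.
# The chain below is written out by hand, standing in for a resolver.

text = ("The lion is the king of the jungle. She hunts mostly at night. "
        "The females live in groups. The male is much larger, but lives alone.")
tokens = text.lower().replace(".", "").split()   # 26 word tokens

# Surface-form count: only the literal word form 'lion'.
surface_count = tokens.count("lion")

# Mentions a resolver would link into the 'lion' chain, per the slide's markup:
# lion, king[+1], She[+2], The females[+3], The male[+4], and the zero subject
# of 'lives'[+5].
chain_mentions = ["lion", "king", "she", "the females", "the male", "(zero subject)"]
reference_count = len(chain_mentions)

print(f"word form: 'lion' is {surface_count} of {len(tokens)} words")
print(f"reference: 'lion' is {reference_count} of {len(tokens)} words")
```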
Machine Translation
Det satt en katt-i på bordet-j. Heldigvis sto det-j stille. / Heldigvis sto den-i stille.
There was a cat on the table. Fortunately, it was standing still.
Without co-reference, an unambiguous sentence becomes ambiguous. This is important when translating between case, gender or aspect marking languages.

Machine Translation
The monkey ate the banana because ...
1) it was hungry: hungry(it = monkey)
2) it was ripe: ripe(it = banana)
3) it was tea time: eat(Agent, Food, When = tea time)

Prosody: Text-to-Speech systems
A given is seldom stressed (Horne & Johansson 1991).
I will never sell my dog. I LOVE the old mutt.
If 'the old mutt' and 'my dog' are coreferent, stress is more likely to move to some other (new) information.

Disambiguation
The lion roams the savannah.
If there is no antecedent, assume the definite NP refers more generally (to the species).
Cognitive Givenness Hierarchy: a referent must be uniquely identifiable. There is only one species of lion; there are many individual lions.

Disambiguation
Hunden knäckte benet med sina käkar. (Swedish 'benet' can mean either 'the bone' or 'the leg'.)
The dog broke the bone with his jaws. / The dog broke the leg with his jaws. / ? The dog broke his leg with his jaws.
If 'benet' can be identified in a reference chain, we are more likely to get the correct translation.

Factors

Gender
Grammatical gender: Det sto en katt-x på bordet-y. Den-x / det-y sto heldigvis stille.
There was a cat on the table. Luckily, it kept still.
Natural gender: Den nye eleven-x likte sykkelen-y. Han-x / hun-x / den-y var rask.
The new student liked the bike. He / She / It was fast.

Function of the antecedent: 'centering'
Kari-S var sent ute, så hun-S ringte søsteren-O sin-S. Hun-S skrek inn i røret.
Kari was late, so she called her sister. She yelled into the blower.
'She' most likely refers to Kari, as that choice keeps the focus on her.

Determined Noun Phrase
A cat and a dog were fighting outside. The dog howled like a wolf.
Identify stems in Scandinavian: hund / hunden.
Look at that dog-i. I wouldn't like to meet that beast-i alone.
Compatible units are found through a semantic network / world knowledge (ontologies).

Negation
Ann didn't see any woman. She was next door.
Ann saw no woman. She was in another room.
Ann didn't talk to her. She was upset. 1) She = Ann 2) She = her
Ann talked to her. She was upset. 1) She = her 2) She = Ann
Ann talked to her, because she was upset. (?)
Ann talked to her. She became upset. 1) She = her ? She = Ann
Cause -> Effect.

Explanation
The students protested, while the police observed. Then they attacked.
The students protested, while the police observed. Then they began to throw stones.
What do students do? And what do police do? This depends heavily on background knowledge.
We might extract some background knowledge from large collections of text, by observing statistical relations between subject and verb, and between verb and object.

Heavy NP
A heavy NP is more likely to be referenced. Heavy NPs have more modifications.
The small man with the black hat sat in the corner chatting with a clerk. He seemed relaxed.
Embedded NPs are less likely to be referred to than top-level NPs.
The clerk sat in the corner chatting with a small man with a black hat. He seemed relaxed.
This interacts with other factors.

Semantics
Through part-whole relations: It was a beautiful car. He sat behind the wheel.
Through subordinate-superordinate relations: He wants a dachshund, but I don't know if he can take care of a dog.
Through verb anaphora: They captured Saddam last Sunday. The event was undramatic.

More
Co-ordinated NPs: The cat and dog were fighting. They got hurt. (?)
I saw a cat with one eye last night. It was horrible. The cat (?) The eye (?) Last night (?) The sight / situation (?)

Challenges
Noisy data underlying decisions:
Word class tags are 95 - 98 % correct.
Functional roles are maybe 80 % correct.
Spelling errors, etc.

Challenges
Common knowledge is important, but difficult to model.
Semantic networks; ontologies: what exists in the world; dynamic models: change over time; situational semantics; fuzzy logic; explanation driven processing.

Challenges
Coreference is highly ambiguous. It is often not clear to humans exactly what is referred to.
We have place-holding pronouns: It rains.
We have general reference, where possibly more than one thing is referred to, each to some degree.

Challenges: Fuzziness
(Slide figure: a car straddling parking pockets A and B.)
In fuzzy logic we can say that the car is 0.70 in parking pocket A and 0.30 in parking pocket B. This is not the same as saying it is in A with probability 0.70 (it would then be either in A or somewhere else). Similarly, reference might be to more than one thing to some degree, simultaneously.
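A small sketch of that distinction, using the slide's numbers. The dictionaries and the min rule for fuzzy conjunction are illustrative assumptions, not part of the deck.

```python
# Fuzzy membership vs. probability, with the slide's numbers (0.70 / 0.30).

# Fuzzy reading: the car is simultaneously 0.70 in pocket A and 0.30 in pocket B.
membership = {"A": 0.70, "B": 0.30}
overlap = min(membership["A"], membership["B"])     # Zadeh's min rule for "and"
print("fuzzy: the car is in A and B to degree", overlap)   # 0.30, a nonzero overlap

# Probabilistic reading: the car is in exactly one pocket; we just do not know which.
probability = {"A": 0.70, "B": 0.30}
print("probabilistic: P(car in A and B) =", 0.0)    # exclusive outcomes, no overlap

# By analogy, a referring expression may point to several antecedents to some
# degree at once, rather than to exactly one antecedent with some probability.
```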
Developing a program

A Classification task
Machine learning: decide yes/no for coreference.
Soon, Ng & Lim (2001). A Machine Learning Approach to Coreference Resolution of Noun Phrases. Computational Linguistics, Vol. 27(4).

Preprocessing: Tilburg Memory Based Learner: http://pi0657.uvt.nl/
Input: Now is a tough time to be a computer maker.
1) tagging, 2) chunking, 3) functional role detection:
[NP1Subject Now/RB ] [VP1 is/VBZ ] [NP1NP-PRD a/DT tough/JJ time/NN ] [VP2 to/TO be/VB ] [NP2NP-PRD a/DT computer/NN maker/NN ]

An example of realistic input
[NP1Subject Sun/NNP Microsystems/NNPS ] ,/, [P along/IN ] {PNP [P with/IN ] [NP its/PRP$ rivals/NNS] } ,/, [VP1 has/VBZ had/VBD to/TO go/VB to/TO ] ``/`` [NP1Object warp/NN speed/NN ] and/CC [VP2 then/RB back/VB ] '/UNKNOWN ,/, [NP3Subject Scott/NNP McNealy/NNP ] ,/, [NP4Subject its/PRP$ chief/JJ executive/NN ] ,/, [VP3 said/VBD ] [NP3NP-TMP last/JJ week/NN ] ,/, [C as/IN ] [NP4Subject Sun/NNP ] [VP4 announced/VBD ] [C that/IN ] [NP5Subject it/PRP ] [VP5 would/MD make/VB ] [NP5Object a/DT larger-than-expected/JJ loss/NN ] {PNP [P in/IN ] [NP the/DT current/JJ quarter/NN ] } and/CC [VP6 would/MD lay/VB ] [PRT off/RP ] [NP6Object 3,900/CD workers/NNS ] ./.

Machine Learning
Train a match function for deciding the anaphor-antecedent relation (TiMBL).
The model is easy to expand when more data is available.

Machine Learning: training
We start from a large collection of examples. For each anaphor, construct a match vector for each candidate and mark the vector for antecedent (yes/no).
The match is calculated for 9 features:
string, lemma, suffix (form);
subject, object, complement (of the same verb);
same functional role;
grammatical gender;
number.

Machine Learning: testing
Construct match vectors for the nearest 40 candidates and check the outcome against the large database.
For example, for the first candidate the nearest neighbour has 4 matching features. Collect from the database all exemplars with 3 or 4 matching features. Outcome: 90 no, 10 yes.

Machine Learning: testing
Repeat for the 40 candidates.
Outcome 1: 90 no, 10 yes
Outcome 2: 43 no, 5 yes
...
Outcome 40: 120 no, 10 yes
How do we decide yes or no? We have decided that the most extreme outcome is probably the best, so we calculate the expected values for yes/no from the training set.
Score = (Observed_yes - Expected_yes) / StdDev_yes - (Observed_no - Expected_no) / StdDev_no
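The training and testing slides describe a match vector over nine features, a nearest-neighbour lookup, and an extremity score. A minimal Python sketch of that pipeline follows; the feature values, exemplar counts and helper names are invented here, and the real system would query the TiMBL exemplar base rather than a toy dictionary.

```python
# Sketch of memory-based yes/no coreference classification with an extremity
# score. Feature names follow the slides; everything else is toy data.

from statistics import mean, stdev

FEATURES = ["string", "lemma", "suffix",          # form
            "subject", "object", "complement",    # function w.r.t. the same verb
            "same_role", "gender", "number"]      # role identity and agreement

def match_vector(anaphor, candidate):
    """Binary match vector over the 9 features for one anaphor-candidate pair."""
    return [int(anaphor[f] == candidate[f]) for f in FEATURES]

def extremity_score(obs_yes, obs_no, exp_yes, exp_no, sd_yes, sd_no):
    """Score = (O_yes - E_yes)/sd_yes - (O_no - E_no)/sd_no (the slide's formula)."""
    return (obs_yes - exp_yes) / sd_yes - (obs_no - exp_no) / sd_no

# One anaphor-candidate pair (invented values), as built during training/testing.
anaphor   = dict(string="it", lemma="it", suffix="", subject=1, object=0,
                 complement=0, same_role=1, gender="common", number="sg")
candidate = dict(string="katten", lemma="katt", suffix="en", subject=1, object=0,
                 complement=0, same_role=1, gender="common", number="sg")
vec = match_vector(anaphor, candidate)
print(vec, "->", sum(vec), "matching features")

# Toy yes/no counts among retrieved exemplars for three candidates; the real
# system repeats this for up to 40 candidates.
outcomes = {"candidate_1": (10, 90), "candidate_2": (5, 43), "candidate_3": (10, 120)}

# Expected values and standard deviations should come from the training set;
# here they are simply estimated from the toy outcomes themselves.
yes_counts = [y for y, _ in outcomes.values()]
no_counts  = [n for _, n in outcomes.values()]
exp_yes, exp_no = mean(yes_counts), mean(no_counts)
sd_yes, sd_no   = stdev(yes_counts), stdev(no_counts)

scores = {c: extremity_score(y, n, exp_yes, exp_no, sd_yes, sd_no)
          for c, (y, n) in outcomes.items()}
print(scores)
print("most extreme (preferred) candidate:", max(scores, key=scores.get))
```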
Conclusion 1(3)
Coreference is a very general problem in natural language processing, and it extends into related domains such as coreference of images.
Establishing coreference has many applications: MT, IR, T2S, etc.
Coreference is also a phenomenon with inherent difficulties: it might be vague and/or ambiguous, and it often depends heavily on background knowledge, which can be difficult to capture in a formal model.

Conclusion 2(3)
Using machine learning to adapt coreference resolution to a textual domain gives us a general method to handle the problem.
One problem of machine learning is finding the relevant features. Another problem is that the features we use often interact with each other.
Vagueness and ambiguity often make it impossible to select only one candidate for co-reference.

Conclusion 3
There is certainly a problem with evaluation, as some mistakes are more serious than others.
Machine learning of co-reference is still a young research field. Much work is needed, and many good ideas are certain to emerge.

Thank you for listening
http://ling.uib.no/BREDT/
[email protected]
[email protected]
[email protected]
