Lecture3 NLP grammars parsing


Published on October 21, 2007

Author: FunnyGuy

Source: authorstream.com

Content

Parsing Natural Languages with Context-free Grammars

Martin Volk
Computational Linguistics
Stockholm University
[email protected]

The Chomsky Hierarchy

The Chomsky Hierarchy states restrictions on rules. Given that A and B are non-terminals, x is a string of terminals, and α, β, γ are arbitrary strings (of terminals and non-terminals), each rule is of the form:

Type 3: A → xB or A → x
Type 2: A → γ
Type 1: αAβ → αγβ, where γ is not empty
Type 0: the left side of the rule is not empty

Context-free grammars

Context-free grammars (may) have rules like
NP → Det N
PP → Prep NP
but cannot have rules like
NP PP → PP NP
ADV anfangen → fangen ADV an
This restriction has implications for processing resources and speed.

Issues

Why do computational linguists use formal grammars for describing natural languages?
Are natural languages context-free languages?
Are there grammar formalisms that linguists prefer? (ID/LP-grammars)

The goal of Natural Language Processing (NLP)

Given a natural language utterance (written or spoken), determine: who did what to whom, when, where, how, and why (for what reasons, for what purpose)? Towards this goal: determine the syntactic structure of the utterance.

Steps to syntax analysis

1. For every word in the input string, determine its word class.
2. Group all words into constituents.
3. Determine the linguistic functions (subject, object, etc.) of the constituents.
4. Determine the logical functions (agent, recipient, transferred object, place, time, ...).

An example

A book was given to Mary by Peter.
det noun aux verb prep name prep name

Grouping the words into constituents, step by step:
noun phrase: a book
verb group: was given
prep phrases: to Mary, by Peter
verb phrase: was given to Mary by Peter
sentence: the noun phrase plus the verb phrase form a passive sentence, with "a book" as the logical object and "Peter" as the logical subject.

Result

Agent (the giver): Peter
The object: a book
Recipient: Mary
Action: giving
When: in the past
Via inference: Who has the book now? Mary.

The context-free rules of a natural language grammar

Noun_Phrase → Determiner Noun
# a book, the house, some houses, 50 books, Peter's house

Adjective_Phrase → Adjective
Adjective_Phrase → Adverb Adjective
# nice, nicest, very nice, hardly finished

Noun_Phrase → Det Adjective_Phrase Noun
# a nice book, the old house, some very old houses, 50 green books

Prep_Phrase → Preposition Noun_Phrase
# with a nice book, through the old house, in some very old houses, for 50 green books

The rules (may) include recursion, direct and indirect. Examples:
NP → NP PP          # the bridge over the Nile
NP → NP Srelative   # a student who likes this course
Srelative → NP VP   # who likes this course

a student who likes this course
det noun rel-pron verb det noun
The whole string is an NP: [[a student]NP [[who]NP [likes [this course]NP]VP]Srel]NP

Formal Definition of a Context-free Grammar

A context-free grammar consists of
a set of non-terminal symbols N
a set of terminal symbols Σ
a set of productions A → α, where A ∈ N and α is a string over (Σ ∪ N)*
a designated start symbol (from N)

Context-free grammars for natural language

A set of non-terminal symbols N: word class symbols (N, V, Adj, Adv, P) and linguistic constituent symbols (NP, VP, AdjP, AdvP, PP)
A set of terminals Σ: all words of the English language
A set of productions A → α: the grammar rules (e.g. NP → Det AdjP N)
A designated start symbol: a symbol for the complete sentence

How many ...?

How many non-terminals do we need?
Word class symbols (N, V, Adj, Adv, P): usually between 20 and 50.
Linguistic constituent symbols (NP, VP, ...): usually between 10 and 20.

How many terminals do we need? The words of the English language:
Different word stems (see, walk, give): more than 50,000.
Different word forms (see, sees, saw, seen): more than 100,000.

How many grammar rules do we need?
NP → Name      # Mary, Peter
NP → Det Noun  # a book
PP → Prep NP   # to Mary
VP → V NP PP   # gave a book to Mary
VP → V NP NP   # gave Mary a book
Problem: this grammar will also accept
*Peter give Mary a books.   # agreement problem
*Peter sees Mary a book.    # complement problem

Agreement: Why bother?

*Peter give Mary a books.
Consider:
Peter threw the books into the garbage can that are old and grey.
Peter threw the books into the garbage can that is old and grey.
Agreement can help us determine the intended meaning!

Agreement: First approach

NPsg → Namesg        # Mary, Peter
NPsg → Detsg Nounsg  # a book
NPpl → Detpl Nounpl  # the books
PP → Prep NPsg       # to Mary
PP → Prep NPpl       # for the books
VP → V NPsg NPsg     # gave Mary a book
VP → V NPsg NPpl     # gave Mary the books
VP → V NPpl NPsg     # gave the kids a book
VP → V NPpl NPpl     # gave the kids the books
Combinatorial explosion: too many rules.

Agreement: Better approach

Variables ensure agreement via feature unification.
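The contrast between the two approaches can be sketched in a few lines of Python. This is a hypothetical mini-formalism (the lexicon and helper names are illustrative, not from any real parsing library): a rule that shares one Num variable between mother and daughters licenses exactly the phrases that agree, replacing the sg/pl rule explosion above.

```python
# Minimal sketch of agreement via feature unification (hypothetical formalism).
# The rule NP[Num] -> Det[Num] Noun[Num] succeeds only if both children
# can unify their Num feature with one shared value.

def unify(value_a, value_b):
    """Unify two feature values; None acts as an unbound variable."""
    if value_a is None:
        return value_b
    if value_b is None or value_a == value_b:
        return value_a
    return "fail"

# Toy lexicon: word -> (category, number feature)
LEXICON = {
    "a": ("Det", "sg"), "the": ("Det", None),   # "the" is unmarked for number
    "book": ("Noun", "sg"), "books": ("Noun", "pl"),
}

def parse_np(det, noun):
    """Apply NP[Num] -> Det[Num] Noun[Num]; return the NP's Num, or 'fail'."""
    det_cat, det_num = LEXICON[det]
    noun_cat, noun_num = LEXICON[noun]
    if det_cat != "Det" or noun_cat != "Noun":
        return "fail"
    return unify(det_num, noun_num)

print(parse_np("a", "book"))     # sg   -- agreement succeeds
print(parse_np("the", "books"))  # pl   -- unmarked "the" unifies with either number
print(parse_np("a", "books"))    # fail -- *a books: sg vs. pl
```

One pair of rules with variables thus covers what took four singular/plural rule copies in the first approach.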
NP[Num] → Name[Num]           # Mary, Peter
NP[Num] → Det[Num] Noun[Num]  # a book, the books
PP → Prep NP[X]               # to Mary, for the books
VP[Num] → V[Num] NP[X] NP[Y]  # give Mary a book; gives Mary the books

Subcategorization

Verbs have preferences for the kinds of constituents they co-occur with. For example:
VP → Verb          (disappear)
VP → Verb NP       (prefer a morning flight)
VP → Verb NP PP    (leave Boston in the morning)
VP → Verb PP       (leaving on Thursday)
But not: *I disappeared the cat.

Parsing as Search

Top-down parsing
Bottom-up parsing
(see the Jurafsky slides)

That sounds nice ... where is the problem?

From the Financial Times of Nov. 23, 2004, at http://news.ft.com/home/europe:

McDonald's CEO steps down to battle cancer
By Neil Buckley in New York
Published: November 23 2004 00:51

McDonald's said on Monday night Charlie Bell would step down as chief executive to devote his time to battling colorectal cancer, dealing another blow to the world's largest fast food company. Mr Bell's resignation comes just seven months after James Cantalupo, its former chairman and chief executive, died from a heart attack. McDonald's moved quickly to close the gap, appointing Jim Skinner, currently vice-chairman, to the chief executive's role.

Problems when parsing natural language sentences

Words that are (perhaps) not in the lexicon:
Proper names: James Cantalupo, McDonald's, InterContinental, GE
Compounded words, which need to be segmented: kurskamrater, kurslitteratur, kursavsnitt, kursplaneundersökningarna, kursförluster, valutakurs, snabbkurs, säljkurser, aktiekurser, valutakursindex
Foreign language expressions: "Don Kerr är Mellanösternspecialist på The International Institute for Strategic Studies i London, högt ansedd, oberoende thinktank." (Swedish: "Don Kerr is a Middle East specialist at The International Institute for Strategic Studies in London, a highly regarded, independent think tank.")
Multiword expressions:
Idioms: to deal another blow
Metaphors: to battle cancer

Ambiguities:
Word level: kurs as in valutakurs ("exchange rate") or kurskamrat ("course mate")
Sentence level:
He sees the man with the telescope.
Old men and women left the occupied city.
Additional knowledge sources are needed to resolve ambiguities: more world knowledge, and statistical knowledge (parsing preferences).

How can we obtain statistical preferences?

From a parsed and manually checked corpus (= a collection of sentences). Such a corpus is usually a database that stores the correct syntax tree with each sentence (and is therefore called a treebank). Building a treebank is very time-consuming.

Can all the syntax of natural language be described with context-free rules? Are there phenomena in natural language that require context-sensitive rules?

Limits of Context-free Grammars

It is not possible to write a context-free grammar (or to design a Push-Down Automaton, PDA) for the language L = {a^n b^n a^n | n > 0}. Why? Intuitively: the memory component of a PDA works like a stack. One stack! So it can only be used to count once.

Are natural languages context-free?

Yes! But ... there is a famous paper about some constructions in Swiss German of the form
w a^n b^m x c^n d^m y
Jan säit, das mer (em Hans) (es huus) (hälfed) (aastriiche).
Jan säit, das mer (d'chind)^n (em Hans)^m (es huus) (haend wele laa)^n (hälfe)^m (aastriiche).
("Jan says that we have wanted to let the children help Hans paint the house.")
But these constructions are rather strange and rare, and the claim that they are not context-free relies on the assumption that n and m are unbounded.

The notion of "context"

We need "context" to understand a natural language utterance! This notion of "context" is different from the notion of "context" in the name "context-free languages".

Do linguists like context-free grammars?
Not really ...

Linguists want ...

... to express grammar rules on different levels of abstraction. For example, instead of saying
NP → NP Conj NP        # the boy and the girl
VP → VP Conj VP        # sang and danced
AdjP → AdjP Conj AdjP  # wise and very famous
they would like to say
XP → XP Conj XP

... to (be able to) state dominance and precedence separately.
Peter dropped the course happily.
Happily Peter dropped the course.
S → Adv S'
S → S' Adv

Context-free Grammars

Context-free grammar rules encode both dominance and precedence information. Example:
A → B C D
A dominates B, C and D; B precedes C, which in turn precedes D.

ID/LP-Grammars

ID/LP-grammars have separate rules: ID (immediate dominance) rules and LP (linear precedence) rules. Example:
ID-rule: A → {B, C, D}   (A dominates B, C and D)
LP-rule: B < C           (B precedes C)
ID/LP grammars have been proposed in linguistics, e.g. in Generalized Phrase Structure Grammar (GPSG; Gazdar, Klein, Pullum, Sag, 1985).

Example from German:
Gestern hat [VP der Professor der Sekretärin diese Blumen geschenkt].
Gestern hat [VP der Professor diese Blumen der Sekretärin geschenkt].
Gestern hat [VP diese Blumen der Professor der Sekretärin geschenkt].
Gestern hat [VP diese Blumen der Sekretärin der Professor geschenkt].
Gestern hat [VP der Sekretärin der Professor diese Blumen geschenkt].
Gestern hat [VP der Sekretärin diese Blumen der Professor geschenkt].
("Yesterday the professor gave the secretary these flowers", in all six orderings of the noun phrases.)

The German verb phrase (the Mittelfeld) consists of an NP_nominative, an NP_dative, an NP_accusative and a verb. Accounting for all order variations requires 6 context-free grammar rules, but only one ID-rule plus one LP-rule:
VP → {NP_accusative, NP_dative, NP_nominative, V}
NP < V

ID/LP-Grammars vs.
Context-free Grammars

All ID/LP-grammars can be transformed into strongly equivalent context-free grammars. Some context-free grammars cannot be transformed into strongly equivalent ID/LP grammars. Example: the context-free grammar consisting of the rule
A → a c a
cannot be transformed into a strongly equivalent ID/LP grammar, because of contradictory ordering constraints: a before c AND c before a. An additional non-terminal is required:
ID-rules: A → {Z, a}   Z → {a, c}
LP-rules: Z < a   a < c

Summary

Why do computational linguists use formal grammars for describing natural languages? As an intermediate step towards capturing the meaning of natural language utterances.
Are natural languages context-free languages? The syntax of natural languages can (in general) be described with context-free grammars.
What grammar formalisms do linguists prefer? Linguists want to describe natural language as precisely and as comfortably as possible. They prefer grammar formalisms with feature variables, metarules, ID/LP separation, schemata, abstract rules ...

Any Questions?
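As a closing illustration, the Mittelfeld example can be made concrete with a short Python sketch. This is a toy expansion procedure (the symbol names and helper are illustrative, not part of any ID/LP toolkit): expanding the single ID-rule under the single LP-constraint yields exactly the six context-free linearizations listed above.

```python
from itertools import permutations

# Sketch: expand one ID-rule into context-free rules under an LP-constraint.
# ID-rule: VP -> {NP_acc, NP_dat, NP_nom, V};  LP-rule: NP < V.

ID_RHS = {"NP_acc", "NP_dat", "NP_nom", "V"}

def lp_ok(order):
    """LP-constraint NP < V: every noun phrase must precede the verb."""
    v_pos = order.index("V")
    return all(order.index(sym) < v_pos
               for sym in order if sym.startswith("NP"))

# All admissible linearizations of the unordered right-hand side:
expansions = [order for order in permutations(sorted(ID_RHS)) if lp_ok(order)]

for order in expansions:
    print("VP ->", " ".join(order))

print(len(expansions))  # 6: the three NPs in any order, the verb last
```

One unordered ID-rule plus one precedence constraint thus stands in for six ordered context-free rules, which is exactly the compaction the ID/LP slides argue for.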
