NR6 1

Information about NR6 1

Published on March 4, 2008

Author: Moorehead

Source: authorstream.com

Content

Language and Tools for Lexical Resource Management:
Asanee Kawtrakul (1), Aree Thunkijjanukij (2), Preeda Lertpongwipusana (1), Poonna Yospanya (1)
(1) Department of Computer Engineering, Faculty of Engineering
(2) Thai National AGRIS Center
Kasetsart University

Acknowledgement:
JIRCAS: Japan International Research Center for Agricultural Sciences
Organizing committee
Kasetsart University

Outline:
Background & Motivation
Problems in Lexical Resource Preparation
Requirements for Lexical Resource Management
Proposed Language and Tools
Conclusion and Next Steps

Background and Motivation:
Thailand is an agriculture-based country with rich knowledge and data in the agricultural field.
A great quantity of agricultural information is scattered across unstructured and unrelated text.
Skimming, digesting, and integrating this information becomes essential.
Knowledge is spread around the world.
Knowledge discovery without language barriers is also needed.

The Basic Idea Behind:
Graphical User Interface

Textual Data as an Input:
Let us focus on Canada’s agricultural products. In 1998, there were 1,216 registered commercial egg producers in Canada. Ontario produced 39.8% of all eggs in Canada; Quebec was second with 16.6%. The western provinces have a combined egg production of 35.6% and the eastern provinces have a combined production of 8.0%.
Courtesy of Agriculture and Agri-Food Canada, http://www.agr.ca/cb

Summarization and Translation as a Result:

The Development of Agricultural System for Knowledge Acquisition and Dissemination:
5-year project (2001-2005)
Collaborative work between:
  Thai National AGRIS Center: providing the bilingual thesaurus (AGROVOC)
  Department of Computer Engineering: developing NLP techniques for searching, summarizing, and translation, including tools for lexical resource management
Funded by Kasetsart University Research and Development Institution

Slide9 (diagram labels):
Acquisition System; Rules; Thesaurus; Lexicon; Linguist/Domain Expert; Very Large Corpus; Document Indexing & Clustering; Linguistic Knowledge Base; Intelligent Search Engine (with Translation, with Summarization); Gathering Module; Internet/Intranet

Thai Agricultural Thesaurus:
The English vocabulary totals 27,531 terms.
Only 10,280 terms have been translated into Thai (excluding scientific names).
Scientific names are not translated, e.g. Oryza (genus) sativa (species) of rice, or family.

Problem in Hand-Coded Thesaurus:
Scalability
Reliability and coherence
Rigidity
Cost

Slide12 (diagram labels):
Foods; Bakery Product; Dietetic Foods; Frozen Foods; Fermented Foods; Processed Products; Canned Products; Dried Products; Frozen Products; Fermented Products; Alcoholic Beverage; Milk; Fermented Foods; Fermented Fish

Slide13 (diagram labels):
Foods; Fermented Foods; Processed Products; Local Product; Products; Fermented Fish

Slide14:
Commercial Vegetables: The September index, at 107, was up 1.9 percent from last month but 3.6 percent below September 1998.
Price increases for lettuce, tomatoes, broccoli, and celery more than offset price decreases for onions, carrots, and cucumbers.
Diagram labels: Commercial Vegetable; tomatoes; broccoli; carrots; cucumbers

Slide15 (diagram labels):
tomato; tomatoes; Keyword Assigned; Commercial Vegetable; broccoli; carrot; tomato; User Category

Other Major Problems (1):
Accessing textual information
Language variation: many ways to express the same idea
  Ex: "thinning flower" uses "deblossoming"; "thinning branch" uses "pruning"
How can the computer know that the words a person uses are related to the words found in stored text?
  Ex: user: thinning branch; computer: pruning

Requirement (1):
Accessing textual information
Need intelligent browsing from related concept to related concept, rather than from occurrences of stemmed character strings

Other Major Problems (2):
Transforming unstructured information into structured information

Requirement (2):
Need an application-based frame about product price
Knowledge representation in table form, consisting of attributes and their values
Table columns: Attributes; Values

Problems in Translation: Pragmatic and Semantic:
The September All Farm Products Index was 97 percent of its 1990-92 base, down 1.0 percent from the August index and 2.0 percent below the September 1998 index.
Using ontology (diagram annotations):
0.97 * averagePrice of years 1990-1992
September of year ??
August
Year 1997
Down 0.02 * price(September 1998)

“Year 1990-1992” meaning:

Requirement (3):
The lexicon should have semantic constraints between lexical entities and restrictions on usage categories.

Summary of Problems Related to Lexicon:
In terms of coverage:
  Extensional coverage, i.e., the number of entries
  Intensional coverage, i.e., the number of information fields
In terms of the semantic domain covered by the application:
  Meaning interpretation with respect to objects, subject matter, topics of discourse, and pragmatic interpretation
The user category with reference to the intended system users:
  Commercial products vs. plant products vs. family products

One Solution:
Encoding world knowledge in the structures attached to each lexical item, which needs both a language and tools

The Design of Lexicon: Requirement Specification:
Macrostructure: lexicon structure in terms of relations between lexical entries, i.e., hierarchical taxonomies, which are characteristic of thesauri of semantically related word families
Microstructure: types of information for each entry
  Pronunciation or phonemic transcription
  Syntactic properties
  Meaning
  Pragmatics of their use in real context and language

Microstructure (cont’):
A lexical entity could contain slots/scripts for each specific domain, and needs an intelligent analyzer and language understanding
Supplies information extraction
Supplies the missing value

Lexical Resource Management Language:
A language which is able to:
Handle heterogeneity of linguistic knowledge structures.
Handle exceptions and inconsistencies of natural languages.
Provide an intuitive means to store and manipulate both linguistic and world knowledge.

Language Features:
The language is designed to enable:
Support for heterogeneous structures.
Sufficient provisions to handle exceptions and inconsistencies of natural languages (this is achieved through the +/- operators).
Deduction of knowledge from rules.
Detection and prevention of potential integrity violations.

Language and Tools Specification Requirement:
Flexibility: almost any structure can be defined in this model.
Extensibility: extending a structure is simple.
Maturability: structure reformation and deformation are supported.
Integrity: meta-relations help prevent malformed or semantically ill-formed data entries.
Dealing with inconsistencies is feasible.

Some Syntactic Elements:
Knowledge manipulations are achieved through these primitives:
def defines structures that do not already exist.
redef changes aspects of existing structures.
undef removes specified structures from the knowledge base.
ret retrieves structures from the knowledge base.

Examples:
Hierarchies: tree structures representing generalization semantics, or classes, of atoms.
Diagram labels: thing; animate; inanimate; animal; human
Caption: A semantic tree represented by a hierarchy structure

Usage Examples:
Defining a hierarchy:
def thing(animate(human+animal)+inanimate).
Adding the ‘plant’ and ‘vehicle’ concepts:
def animate(plant+vehicle).
Reparenting the ‘vehicle’ concept:
redef animate(vehicle) inanimate(vehicle).
Removing the ‘human’ concept:
undef human.
(provided that there is only a single instance of ‘human’)

Usage Examples (2):
Defining case frames for verbs
First, we need to define meta-relations for words belonging to the sub-hierarchy ‘verb’.
def meta case(verb, sub:thing).
def meta case(verb, sub:thing, obj:thing).
Then, we define case frames for several verbs.
def case(eat, sub:human+animal, obj:food).
def case(fly, sub:bird-penguin).
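
The hierarchy statements and the `bird-penguin` expression above can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: the class `Hierarchy`, its method names, the one-parent-per-concept restriction, the subtree-removing behaviour of `remove`, and the reading of `a+b-c` as "the subtrees of a and b minus the subtree of c" are illustrative choices, not taken from the slides. The concepts `bird`, `penguin`, and `sparrow` are hypothetical additions so that the expression has something to resolve.

```python
class Hierarchy:
    """Toy concept tree; one parent per concept (an assumption of this sketch)."""

    def __init__(self):
        self.parent = {}  # concept -> parent concept (None for the root)

    def define(self, concept, parent=None):
        """Like `def`: add a concept that does not exist yet."""
        if concept in self.parent:
            raise ValueError(f"{concept!r} is already defined")
        if parent is not None and parent not in self.parent:
            raise KeyError(f"unknown parent {parent!r}")
        self.parent[concept] = parent

    def reparent(self, concept, new_parent):
        """Like `redef`: move an existing concept under another parent."""
        if concept not in self.parent or new_parent not in self.parent:
            raise KeyError("both concepts must already exist")
        if new_parent in self.descendants(concept):
            raise ValueError("cannot move a concept under its own subtree")
        self.parent[concept] = new_parent

    def remove(self, concept):
        """Like `undef`; dropping the whole subtree is an assumption."""
        for name in self.descendants(concept):
            del self.parent[name]

    def descendants(self, concept):
        """Return the concept together with everything below it."""
        found, frontier = {concept}, [concept]
        while frontier:
            current = frontier.pop()
            for child, parent in self.parent.items():
                if parent == current and child not in found:
                    found.add(child)
                    frontier.append(child)
        return found

    def resolve(self, include, exclude=()):
        """Expand `a+b-c`: subtrees of a and b, minus the subtree of c."""
        result = set()
        for name in include:
            result |= self.descendants(name)
        for name in exclude:
            result -= self.descendants(name)
        return result


h = Hierarchy()
# def thing(animate(human+animal)+inanimate).
h.define("thing")
h.define("animate", "thing")
h.define("human", "animate")
h.define("animal", "animate")
h.define("inanimate", "thing")
# def animate(plant+vehicle).
h.define("plant", "animate")
h.define("vehicle", "animate")
# redef animate(vehicle) inanimate(vehicle).
h.reparent("vehicle", "inanimate")
# undef human.
h.remove("human")

# Hypothetical concepts so that `sub:bird-penguin` can be resolved.
h.define("bird", "animal")
h.define("penguin", "bird")
h.define("sparrow", "bird")
fly_subjects = h.resolve(include=["bird"], exclude=["penguin"])
print(sorted(fly_subjects))  # ['bird', 'sparrow']
```

Storing only parent links keeps reparenting a single assignment; a real implementation would likely need multiple parents and instance handling, which the slides hint at but do not specify.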
(Here, we emphasize the use of the +/- operators.)

Hierarchy & Set:
Diagram labels: c1; w1; w7; c2; w2; p1; w6; f1; f2; f3; f4; c3; w4; w5; w3

Defining a Hierarchy:
Diagram labels: c1; w1; w7; c2; w2; p1; w6; w4; w5; w3
def c1(“w1”(“w3”)+c2(“w4”)+“w2”).
def “w5”+“w6” under “w4”.
def “p1”(“w7”) under “w2”.

Manipulating the Hierarchy:
Diagram labels: c1; w1; w7; c2; w2; p1; w6; w4; w5; w3
redef “w4” under “w2”.
undef “w1”.

Defining a Set:
Diagram labels: f1; f2; f3; f4; c3
def c3{[f1]+[f2]+[f3]}.
def [f4] in c3.

Defining a Relation:
Diagram labels: c2; w6; f1; f2; f3; f4; c3; w4; w5; r1’; r1; c2; w1; inherited
def meta r1(c2, c3).
Template defined.
def r1(“w4”, [f1]).
Relation defined.
def r1(“w1”, [f3]).
Constraint violated. Definition not allowed.

Synset & Surrogates:
A synset is an unnamed set identified by its unique ID.
Members of a synset are considered synonymous, with different degrees of synonymity.
A distance graph is automatically constructed within a synset, with surrogates acting as representatives of synset members.
Entities with identical features are attached to the same surrogate.

Synset & Surrogates (diagram labels):
s1; s4; s2; s3; s5; w2; w1; p2; p3; w3; w4; w6; p1; f2; f4; f4; f3; f3; f4; f1; f1; f1; f3; f2; f1; f4; synset#1; surrogate network (internally constructed)

Synset & Multilingual Lexicon:
Synset members are not confined to a single language; that is, entities from different languages may belong to the same synset.
The distance matrix is computed from the number of differing features over each pair of surrogates.
Traversing from a word to its nearest words is handled by the system.
We can determine words with potentially the nearest semantics here.

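
The surrogate and distance-matrix description above can be illustrated with a small Python sketch. It rests on assumptions: "number of different features" is read as the size of the symmetric difference between two feature sets, and the member names, feature assignments, and function names below are hypothetical rather than taken from the slides.

```python
# Hypothetical synset members and their feature sets.
MEMBERS = {
    "w1": {"f1", "f2"},
    "w2": {"f1", "f2"},
    "w3": {"f1", "f3"},
    "p1": {"f1", "f3", "f4"},
}


def build_surrogates(members):
    """Attach members with identical feature sets to the same surrogate."""
    surrogates = {}
    for name, features in members.items():
        surrogates.setdefault(frozenset(features), []).append(name)
    return surrogates


def distance(a, b):
    """Count the features that occur in exactly one of the two surrogates."""
    return len(a ^ b)


def distance_matrix(surrogates):
    """Pairwise distances between all surrogates, keyed by feature-set pair."""
    keys = list(surrogates)
    return {(x, y): distance(x, y) for x in keys for y in keys}


def nearest_words(word, members):
    """Return the members attached to the closest other surrogate(s)."""
    surrogates = build_surrogates(members)
    own = frozenset(members[word])
    others = [key for key in surrogates if key != own]
    if not others:
        return []
    best = min(distance(own, key) for key in others)
    return sorted(
        name
        for key in others
        if distance(own, key) == best
        for name in surrogates[key]
    )


surrogates = build_surrogates(MEMBERS)
matrix = distance_matrix(surrogates)
print(len(surrogates))               # 3: w1 and w2 share one surrogate
print(nearest_words("w1", MEMBERS))  # ['w3']
```

Because members are grouped by feature set before distances are computed, the matrix grows with the number of distinct feature sets rather than with the number of words, which matters once entries from several languages share a synset.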
Expected Result:

Slide43 to Slide50 (successive frames of one diagram; final frame shown):
Input keyword: tomatoes
Keyword Generated for “Fruit vegetable”, red: Sweet pepper; Tomatoes; Cherry Tomatoes
Expert Domain entries:
  VEGETABLES
  BROCCOLI: type=leaf vegetable; color=green
  SWEET PEPPER: type=fruit vegetable; color=red, green, yellow
  TOMATOES: type=fruit vegetable; color=red, yellow
  CHERRY TOMATOES: type=fruit vegetable; color=red
  LYCOPERSICON ESCULENTUM: type=taxonomic
  SOLANACEAE; CAPSICUM; NICOTIANA
Relation labels: BT; NT; RT
Additional label: color=red

Slide51 to Slide55 (successive frames of the same diagram; final frame shown):
Keyword Generated for “Plant in same family”: Capsicum; Nicotiana

Slide56 (same diagram, with further labels):
tomato; tomatoes; Keyword Assigned; Commercial Vegetable; broccoli; carrot; tomato; User Category
Keyword Generated for tomato: Tomato; Tomatoes; Cherry Tomatoes

Conclusion and Next Steps:
This is a preliminary introduction to the language, showing a few of its many possibilities.
Structures not described in detail here have not yet been firmly specified.
These structures are rules, maps, and contexts, which are incorporated to extend the language's potential for handling deduction, multilingual operations, domain-dependent retrieval, etc.

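
The "Expected Result" slides show keywords generated for the query “Fruit vegetable”, red. A minimal Python sketch of such a feature lookup follows, using the entries and attribute values listed on those slides. The dictionary layout and the function name `generate_keywords` are assumptions made for illustration, not the authors' implementation.

```python
# Entries and attribute values as listed on the slides.
ENTRIES = {
    "BROCCOLI": {"type": "leaf vegetable", "color": {"green"}},
    "SWEET PEPPER": {"type": "fruit vegetable", "color": {"red", "green", "yellow"}},
    "TOMATOES": {"type": "fruit vegetable", "color": {"red", "yellow"}},
    "CHERRY TOMATOES": {"type": "fruit vegetable", "color": {"red"}},
}


def generate_keywords(entries, entry_type, color):
    """Return entries whose type matches and whose colors include `color`."""
    return [
        name
        for name, features in entries.items()
        if features["type"] == entry_type and color in features["color"]
    ]


keywords = generate_keywords(ENTRIES, "fruit vegetable", "red")
print(keywords)  # ['SWEET PEPPER', 'TOMATOES', 'CHERRY TOMATOES']
```

The result matches the three keywords shown on the slides for this query; broccoli is excluded because neither its type nor its color matches.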
Next Steps:
Revise the idea
Continue the implementation
Aligner tool
GUI tools for thesaurus maintenance
Short-term solutions to language variability problems, exploiting available knowledge sources with available techniques
The long-range approach needs high-quality language understanding, i.e., automatic thesaurus construction
System of Agricultural Information Summarization and Translation

Thank you
