Yihong

Information about Yihong

Published on October 2, 2007

Author: Octavio

Source: authorstream.com

Content

Semiautomatic Generation of Resilient Data Extraction Ontologies
    Yihong Ding
    Data Extraction Group
    Brigham Young University
    Sponsored by NSF

Data Extraction Ontology
    Goal: extract data from web pages
    Components
        concepts
        relations between the concepts
        participation constraints
    Resilient
    Difficulty: manual ontology generation is costly

Generation Procedure
    (Diagram; labels as extracted: Knowledge Sources Knowledge Selection Processing Extraction Processing Database Train Test)

Knowledge Collection
    Assumptions about knowledge base
        general
        contains meaningful relationships
        pre-existing
        XML or easy to transfer to XML
    Current input
        Mikrokosmos ontology [Mik]
        auxiliary data frame library

Selection of Concepts
    PROCEDURE ConceptSelection(Tdoc, Kbase)
        SourceDoc = Parse(Tdoc);
        PrimarySelectedConceptsList = MikroSelection(M-Ontology);
        SecondarySelectedConceptsList = DataFrameSelection(DF-Library);
        ConflictHandling();
        SelectedSubgraphGeneration();
    MANY ISSUES
        selection strategies, conflict resolution, …

Basic Selection Strategy
    Select from Mikrokosmos Ontology

    Afghanistan smaller than Texas.
    Area: 648,000 sq. km.
    Capital--Kabul, Other cities--Kandahar Mazar-e-Sharif Konduz
    Terrain: Landlocked; mostly mountains and desert.
    Climate: Dry, with cold winters and hot summers.
    Population: 17.7 million.
    Agriculture: Wheat, corn, barley, rice, cotton, fruit, nuts, karakul pelts, wool, mutton.

Basic Selection Strategy
    Select from Mikrokosmos Ontology
        concept names and their synonyms

    Afghanistan smaller than Texas.
    Area<GeographicalArea>: 648,000 sq. km.
    Capital<CapitalCity><FinancialCapital>--Kabul, Other cities--Kandahar Mazar-e-Sharif Konduz
    Terrain: Landlocked; mostly mountains and desert.
    Climate: Dry, with cold winters and hot summers.
    Population<Population>: 17.7 million.
    Agriculture: Wheat, corn, barley, rice, cotton, fruit, nuts, karakul pelts, wool, mutton.

Basic Selection Strategy
    Select from Mikrokosmos Ontology
        concept names and their synonyms
        concept values and their synonyms

    Afghanistan<Nation> smaller than Texas<USState>.
    Area<GeographicalArea>: 648,000 sq. km.
    Capital<CapitalCity><FinancialCapital>--Kabul<CapitalCity>, Other cities--Kandahar Mazar-e-Sharif Konduz
    Terrain: Landlocked; mostly mountains and desert.
    Climate: Dry, with cold winters and hot summers.
    Population<Population>: 17.7 million.
    Agriculture: Wheat<FoodStuff><AgriculturalProduct>, corn, barley, rice, cotton, fruit, nuts, karakul pelts, wool, mutton.

Basic Selection Strategy
    Select from Mikrokosmos Ontology
        concept names and their synonyms
        concept values and their synonyms
    Select from Data Frame Libraries

    Afghanistan smaller than Texas.
    Area: 648,000 sq. km.
    Capital--Kabul, Other cities--Kandahar Mazar-e-Sharif Konduz
    Terrain: Landlocked; mostly mountains and desert.
    Climate: Dry, with cold winters and hot summers.
    Population: 17.7 million.
    Agriculture: Wheat, corn, barley, rice, cotton, fruit, nuts, karakul pelts, wool, mutton.

Basic Selection Strategy
    Select from Mikrokosmos Ontology
        concept names and their synonyms
        concept values and their synonyms
    Select from Data Frame Libraries
        extract result based on the data frames

    Afghanistan smaller than Texas.
    Area: 648,000<Area><Mileage> sq. km.
    Capital--Kabul, Other cities--Kandahar Mazar-e-Sharif Konduz
    Terrain: Landlocked; mostly mountains and desert.
    Climate: Dry, with cold winters and hot summers.
    Population: 17.7<Time> million<Population><Price>.
    Agriculture: Wheat, corn, barley, rice, cotton, fruit, nuts, karakul pelts, wool, mutton.

Document-Level Conflict
    Afghanistan smaller than Texas.
    Area: 648,000<Area><Mileage> sq. km.
    Capital<CapitalCity><FinancialCapital>--Kabul<CapitalCity>, Other cities--Kandahar Mazar-e-Sharif Konduz
    Terrain: Landlocked; mostly mountains and desert.
    Climate: Dry, with cold winters and hot summers.
    Population: 17.7<Time> million<Population><Price>.
    Agriculture: Wheat, corn, barley, rice, cotton, fruit, nuts, karakul pelts, wool, mutton.

Concept-Level Conflict
    Afghanistan smaller than Texas.
    Area<GeographicalArea>: 648,000<Area> sq. km.
    Capital--Kabul, Other cities--Kandahar Mazar-e-Sharif Konduz
    Terrain: Landlocked; mostly mountains and desert.
    Climate: Dry, with cold winters and hot summers.
    Population<Population>: 17.7 million<Population>.
    Agriculture: Wheat<FoodStuff><AgriculturalProduct>, corn, barley, rice, cotton, fruit, nuts, karakul pelts, wool, mutton.

Relation Retrieval
    Theoretical solution
        all paths in the subgraph
        too expensive: NP-Complete
    Heuristic solution
        find the shortest path between any two nodes
        set a threshold distance

Participation Constraints
    Afghanistan<Nation> smaller than Texas.
    Area: 648,000 sq. km.
    Capital--Kabul<CapitalCity>, Other cities--Kandahar Mazar-e-Sharif Konduz
    Terrain: Landlocked; mostly mountains and desert.
    Climate: Dry, with cold winters and hot summers.
    Population: 17.7 million.
    Agriculture: Wheat, corn, barley, rice, cotton, fruit, nuts, karakul pelts, wool, mutton.

    CapitalCity [1:1] IsA.CITY.PartOf Nation [1:1]

Participation Constraints (cont.)
    Afghanistan<Nation> smaller than Texas.
    Area: 648,000 sq. km.
    Capital--Kabul<City>, Other cities<City>--Kandahar<City> Mazar-e-Sharif<City> Konduz<City>
    Terrain: Landlocked; mostly mountains and desert.
    Climate: Dry, with cold winters and hot summers.
    Population: 17.7 million.
    Agriculture: Wheat, corn, barley, rice, cotton, fruit, nuts, karakul pelts, wool, mutton.

    City [1:1] PartOf Nation [1:*]

Performance Evaluation
    Speed of generation
    Precision and recall of the generation process
    Precision and recall of the generated ontology

Generation Time with Distance Threshold

P&R of Generation Process

Conclusion
    Data Extraction Ontology generated
    Knowledge sources exploited
    Many issues applied
    Many more to explore
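
The heuristic named on the Relation Retrieval slide (find the shortest path between any two selected nodes, and set a threshold distance) can be illustrated with a small sketch. This is not the author's implementation; it is a minimal Python example under stated assumptions. The function names `shortest_path` and `retrieve_relations`, the undirected adjacency-list representation, and the toy graph are all hypothetical. Only the chain CapitalCity, City, Nation follows the slides; the links from Nation to Population and GeographicalArea are assumed for illustration.

```python
from collections import deque
from itertools import combinations


def shortest_path(graph, start, goal, max_dist):
    """Breadth-first search for a shortest path of at most max_dist edges.

    Returns the path as a list of nodes, or None if goal is not reachable
    within the threshold.
    """
    if start == goal:
        return [start]
    parents = {start: None}          # also serves as the visited set
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_dist:         # threshold reached: do not expand further
            continue
        for nbr in graph.get(node, ()):
            if nbr in parents:
                continue
            parents[nbr] = node
            if nbr == goal:
                # Walk the parent links back to the start node.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append((nbr, dist + 1))
    return None


def retrieve_relations(graph, selected, max_dist):
    """Keep one shortest path per pair of selected concepts, if short enough."""
    relations = {}
    for a, b in combinations(sorted(selected), 2):
        path = shortest_path(graph, a, b, max_dist)
        if path is not None:
            relations[(a, b)] = path
    return relations


# Hypothetical fragment of a concept graph (undirected adjacency lists).
graph = {
    "CapitalCity": ["City"],
    "City": ["CapitalCity", "Nation"],
    "Nation": ["City", "Population", "GeographicalArea"],
    "Population": ["Nation"],
    "GeographicalArea": ["Nation"],
}
selected = {"CapitalCity", "Nation", "Population"}

relations = retrieve_relations(graph, selected, max_dist=2)
for pair, path in relations.items():
    print(pair, "via", " -> ".join(path))
```

With a threshold of 2, the pair CapitalCity and Population is dropped because the two concepts are three edges apart in this toy graph; raising the threshold to 3 would keep it. Breadth-first search suffices here because every edge is treated as having the same length, which is itself an assumption of the sketch.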
