SSA Presentation

Information about SSA Presentation

Published on January 5, 2008

Author: Nastasia

Source: authorstream.com

Content

Investigating the Ancient Meroitic Language Using Statistical Natural Language Techniques: Zipf’s Law and Word Co-Occurrences

Reginald Smith
August 10, 2006
Sudan Studies Association Conference
Rhode Island College

Meroitic is the language of the ancient kingdom of Kush

Used for almost six hundred years, from the 2nd century BCE to the 4th century CE.
Phonetic language written right to left (like Arabic).
Transliteration was made possible by the work of British archaeologist F. L. Griffith around 1910.

Meroitic remains largely undeciphered and an enigma

No complete vocabulary is available.
Some words, such as place names, loan words, or simple concepts, are known. For example, “qore” means king, and “qes” is perhaps Kush.
Many attempts have been made to understand Meroitic using phonology or comparative linguistics.
Scholars have tried in vain to find a known language that is a relative (see sources in paper).
We wish we had a bilingual text like the Rosetta Stone to guide us.

A new method could use mathematics and linguistics

Statistical natural language processing analyzes the properties of language using a mix of statistics and linguistics.
Several properties are the same in all human languages.
Certain techniques may also help us infer the meanings of words by relating them to other known words.

Zipf’s Law: Frequencies of Words

If you rank the words in a text by how often each appears (rank 1 being the most frequent) and then relate each word's rank to its frequency, you get Zipf’s Law.
Zipf’s Law: $F = C / R^{\alpha}$, where F is the frequency of a word, C is a constant, R is the rank, and α is known as the power law exponent.
For all languages α ≈ 1.

Zipf Law Graphs

When you graph frequency vs. rank on a log-log graph (the logarithm of frequency vs. the logarithm of rank), you get a straight line whose slope is -α, because $\log F = \log C - \alpha \log R$.
Figure: Zipf line fit on data. The red line is the fitted slope on the data points. Picture source: University of Helsinki CS department.

Does Meroitic follow Zipf’s Law?

Two graphs show log-log plots of frequency vs. rank for the Meroitic words in 69 texts, with the fitted slope given for each.
The normal plot counts the words as they are. The morpheme out plot splits suffixes like -lowi into the separate words “lo” and “wi”.
Normal plot: slope = -0.81
Morpheme out plot: slope = -1.03
Since its slope is nearly -1, the morpheme out model of Meroitic seems to follow Zipf’s Law.

So what does this show us (besides graphs)?

Despite the apparently small number of texts available, our sample of Meroitic is structured just like all other human languages (English, Chinese, etc.).
Therefore, even though we don’t know the meaning of the words, and even though most of our samples are redundant funeral stelae, we know that the language sample we have is representative.
We can then proceed to use other statistical techniques on Meroitic and also compare its statistical features to those of other languages.

Step Two: Word Co-occurrence

When words occur together in a text, they are said to co-occur. “I am here” has co-occurrence between “I-am” and “am-here”.
Co-occurrences can tell us about the words if we have enough of them.
Words that co-occur with the same words often have similar parts of speech or even meanings.
Can we use word co-occurrence in Meroitic to analyze classes of words?
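
As a recap of the Zipf’s Law step, the log-log slope estimate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name zipf_slope is hypothetical, the fit is an ordinary least-squares line through every (log rank, log frequency) point, and the toy corpus is invented so that frequency is exactly 60 divided by rank. It is not the Meroitic data, and the slides do not say which fitting procedure was actually used.

```python
import math
from collections import Counter


def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank)."""
    # Frequencies sorted from most to least frequent; rank 1 is the top word.
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct words to fit a line")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy corpus (an assumption, not Meroitic): the word at rank R appears 60 / R times.
toy_tokens = []
for rank, word in enumerate(["a", "b", "c", "d", "e", "f"], start=1):
    toy_tokens.extend([word] * (60 // rank))

slope = zipf_slope(toy_tokens)
print(f"fitted slope = {slope:.2f}")  # → fitted slope = -1.00
```

A slope near -1 corresponds to α ≈ 1. On real text the points scatter around the line, so the fitted value depends on how words are tokenized, which is the difference between the normal and morpheme out plots.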
What I did with Meroitic

I analyzed Meroitic by matching together words that co-occurred with the same types of words.
For example, take two sentences: “I eat horses” and “We eat lizards”.
I match “I” and “We” because they both co-occur with “eat”.
I also match “horses” and “lizards” because they also co-occur with “eat” (in the opposite direction*).
I then graph the connected words together and analyze them with software. What happens?
*Technical note: I actually used undirected edges for co-occurring words in the graph shown on the next page.

Meroitic Words Graph

Four main groups of words form that correspond well to Meroitic categories, including positions and titles, verbs, places, and miscellaneous nouns.
Figure: graph of Meroitic words with four clusters, labeled Group 1 through Group 4.

Results

Techniques like word co-occurrence matching can help us categorize Meroitic words whose categories we previously had to guess, by mapping them against words whose part of speech we already know.
Similar statistical techniques may allow us to match words with a similar “meaning” and so infer the meanings of some words.
This is still speculative, though.

Conclusion

Statistical natural language processing is a new approach to Meroitic that could supplement other current efforts on the language.
Much more work remains to be done, but this new avenue may help us move closer to the goal of understanding this beautiful and mysterious language.

Acknowledgements: I give my boundless appreciation to Dr. Richard Lobban and Dr. Laurance Doyle for the help and advice they gave me on this paper’s topics.
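
The matching step from “What I did with Meroitic” can be illustrated on the slide's own two example sentences. The sketch below is a guess at the general idea, not the author's software: the function names context_sets and matched_pairs are hypothetical, only adjacent words count as co-occurring, and the direction of each co-occurrence is kept so that the output mirrors the slide example. The technical note says the actual graph used undirected edges.

```python
from collections import defaultdict
from itertools import combinations


def context_sets(sentences):
    """Map each word to the set of (side, neighbour) contexts it appears in."""
    contexts = defaultdict(set)
    for sentence in sentences:
        words = sentence.lower().split()
        for left, right in zip(words, words[1:]):
            contexts[left].add(("before", right))  # `left` occurs before `right`
            contexts[right].add(("after", left))   # `right` occurs after `left`
    return contexts


def matched_pairs(sentences):
    """Pairs of words that share at least one context, with the shared contexts."""
    contexts = context_sets(sentences)
    matches = {}
    for a, b in combinations(sorted(contexts), 2):
        shared = contexts[a] & contexts[b]
        if shared:
            matches[(a, b)] = shared
    return matches


# The two example sentences from the slide.
matches = matched_pairs(["I eat horses", "We eat lizards"])
print(sorted(matches))  # → [('horses', 'lizards'), ('i', 'we')]
```

Each matched pair would become an edge in a word graph, and clusters in that graph are candidate word classes. Dropping the "before"/"after" tag gives the undirected variant, in which “I” and “horses” would also be matched through “eat”.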
