C102 Bergallo

Information about C102 Bergallo

Published on October 3, 2007

Author: Pravez

Source: authorstream.com

Content

Applying the Pronunciation Lexicon Specification to ASR & TTS:
Monday, August 20, 2007
SpeechTEK ASTS - Advances in Text-to-Speech Processing
Patrizio Bergallo

Agenda:
Loquendo Today
Introduction to PLS
Reference Scenario
Pronunciation Lexicons
International Phonetic Alphabet
Overview of PLS
How does TTS use PLS?
How does ASR use PLS?
Examples of Use
Latest Improvements

Loquendo Today:
Global company of the Telecom Italia group, leader in Europe and South America in the speech technologies market
Founded in 2001 from Telecom Italia Labs, drawing on know-how from more than 30 years of research
Complete set of multilingual speech technologies on a wide spectrum of devices; 25 patents, 50 voices and 20 languages
Full support for international standards (MRCPv1/v2, VoiceXML 2.0/2.1, CCXML, SSML, SRGS, SISR)
Ready for challenging future scenarios: multimodality, security
100 employees, with strong growth throughout 2007
HQ in Turin; offices in the US, Spain, Germany and France; a worldwide network of partners

Reference Scenario:
Many speech applications need to specify pronunciation for words and phrases
  Surnames, locations, company names
  Acronyms
  Names in specific contexts (restaurants, sports, movie titles, etc.)
  Foreign words, mixed languages
Pronunciation is critical for both TTS and ASR
  Improves reading of prompts by TTS
  Improves ASR performance
VoiceXML 2.0/2.1 applications are the reference scenario
  Prompts are based on SSML 1.0 (or, in future, SSML 1.1)
  Recognition grammars are based on SRGS 1.0

Pronunciation Lexicons:
A pronunciation lexicon is a mapping between words (or short phrases), their written representations, and their pronunciations, suitable for use by an ASR engine or a TTS engine
Pronunciation lexicons are not only useful for voice browsers
  They have also proven effective mechanisms to support accessibility for the differently abled as well as greater usability for all users
  They are used to good effect in screen readers and user agents supporting multimodal interfaces
The W3C Pronunciation Lexicon Specification (PLS) Version 1.0 is designed to enable interoperable specification of pronunciation lexicons

Pronunciation Lexicon Specification:
W3C specification status
  Second Last Call Working Draft (26 October 2006)
  The Implementation Report Plan and the Disposition of Comments are currently under development (all public comments were addressed)
  Candidate Recommendation expected 3Q07
Part of the first version of the Speech Interface Framework (Larson, 2000)
[Figure legend: W3C Recommendation; W3C Last Call Working Draft]

International Phonetic Alphabet:
Pronunciation is represented by a phonetic alphabet
Standard phonetic alphabets
  International Phonetic Alphabet (IPA): well-known phonetic alphabet
  SAMPA: ASCII based (simple to write)
  Pinyin (Mandarin Chinese), JEITA (Japanese), etc.
Proprietary phonetic alphabets
International Phonetic Alphabet (IPA)
  Created by the International Phonetic Association (active since 1896), a collaborative effort by all the major phoneticians around the world
  Universally agreed system of notation for the sounds of languages
  Covers all languages
  Requires Unicode to write it
  Normatively referenced by PLS

Overview of PLS:
A PLS document is a container (<lexicon>) of several lexical entries (<lexeme>)
Each lexical entry contains
  One or more spellings (<grapheme>)
  One or more pronunciations (<phoneme>) or substitutions (<alias>)
Each PLS document is related to a single unique language (xml:lang)
SSML 1.0 and SRGS 1.0 documents can reference one or more PLS documents
The current version doesn’t include morphological, syntactic or semantic information associated with pronunciations

PLS Example:
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
        http://www.w3.org/TR/2007/[email protected]@@@/pls.xsd"
    alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>Sepulveda</grapheme>
    <phoneme>səˈpʌlvɪdə</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>W3C</grapheme>
    <alias>World Wide Web Consortium</alias>
  </lexeme>
</lexicon>

How does TTS use PLS?:
SSML 1.0
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.0" … xml:lang="en-US">
  <lexicon uri="http://www.example.com/SSMLexample.pls"/>
  The title of the movie is: "La vita è bella" (Life is beautiful),
  which is directed by Benigni.
</speak>
PLS 1.0
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" … alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>La vita è bella</grapheme>
    <phoneme>ˈlɑ ˈviːɾə ˈʔeɪ ˈbɛlə</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>Benigni</grapheme>
    <phoneme>bɛˈniːnji</phoneme>
  </lexeme>
</lexicon>

How does ASR use PLS?:
SRGS 1.0
<?xml version="1.0" encoding="UTF-8"?>
<grammar version="1.0" … xml:lang="en-US" root="movies" mode="voice">
  <lexicon uri="http://www.example.com/SRGSexample.pls"/>
  <rule id="movies" scope="public">
    <one-of>
      <item>Terminator 2: Judgment Day</item>
      <item>Pluto's Judgement Day</item>
    </one-of>
  </rule>
</grammar>
PLS 1.0
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" … alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>judgment</grapheme>
    <grapheme>judgement</grapheme>
    <phoneme>ˈdʒʌdʒ.mənt</phoneme>
  </lexeme>
</lexicon>

Examples of Use:
Multiple pronunciations for the same orthography
Multiple orthographies
Homophones
Homographs
Acronyms, Abbreviations, etc.
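
The "How does TTS use PLS?" and "How does ASR use PLS?" slides both rely on the same lookup: a spelling found in a prompt or grammar is matched against the <grapheme> elements of the referenced lexicon. The following is a minimal Python sketch of that lookup, not how any particular engine implements it. The function name load_lexicon and the returned table layout are assumptions made for illustration; the namespace URI and the entries are taken from the slides.

```python
import xml.etree.ElementTree as ET

# Namespace URI as shown on the "PLS Example" slide.
PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"

# Entries copied from the slides; the elided attributes ("…") are simply left out.
PLS_DOC = """<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>judgment</grapheme>
    <grapheme>judgement</grapheme>
    <phoneme>ˈdʒʌdʒ.mənt</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>W3C</grapheme>
    <alias>World Wide Web Consortium</alias>
  </lexeme>
</lexicon>"""


def load_lexicon(pls_text):
    """Hypothetical helper: map each grapheme to its (kind, text) entries.

    kind is "phoneme" or "alias"; entries keep document order.
    """
    root = ET.fromstring(pls_text.encode("utf-8"))
    ns = {"pls": PLS_NS}
    table = {}
    for lexeme in root.findall("pls:lexeme", ns):
        graphemes = [g.text for g in lexeme.findall("pls:grapheme", ns)]
        entries = []
        for child in lexeme:
            kind = child.tag.split("}")[-1]  # strip the "{namespace}" prefix
            if kind in ("phoneme", "alias"):
                entries.append((kind, child.text))
        # Every pronunciation in a lexeme applies to every grapheme in it.
        for grapheme in graphemes:
            table.setdefault(grapheme, []).extend(entries)
    return table


lexicon = load_lexicon(PLS_DOC)
print(lexicon["judgement"])
print(lexicon["W3C"])
```

Both spellings of "judgment" resolve to the same phoneme string, which is what lets the SRGS grammar above accept either movie title with one lexicon entry.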
Multiple pronunciations for the same orthography:
Multiple pronunciations are represented by more than one <phoneme> or <alias> element
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" … alphabet="ipa" xml:lang="en-GB">
  <lexeme>
    <grapheme>Newton</grapheme>
    <phoneme>ˈnjuːtən</phoneme>
    <phoneme>ˈnuːtən</phoneme>
  </lexeme>
</lexicon>

Multiple orthographies:
Alternative textual representations for the same word or phrase are represented by more than one <grapheme> inside the same <lexeme>
All the pronunciations given within the <lexeme> apply to each and every <grapheme> within the <lexeme>
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" … alphabet="ipa" xml:lang="ja">
  <lexeme>
    <grapheme>nihongo</grapheme>
    <grapheme>日本語</grapheme>
    <grapheme>にほんご</grapheme>
    <phoneme>ɲihoŋo</phoneme>
  </lexeme>
</lexicon>

Homophones:
Words with the same pronunciation but different meanings are represented as different lexemes
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" … alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>cede</grapheme>
    <phoneme>siːd</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>seed</grapheme>
    <phoneme>siːd</phoneme>
  </lexeme>
</lexicon>

Homographs (1/2):
Words with the same spelling but pronounced in different ways are represented using the role attribute of the <lexeme> element
This mechanism allows for the referencing of defined taxonomies of word classes (part of speech, meaning, etc.)
<lexicon version="1.0"
    xmlns:claws="http://www.example.com/claws7tags"
    alphabet="x-myorganization-pinyin" xml:lang="zh-CN">
  <lexeme role="claws:VV0">
    <!-- base form of lexical verb -->
    <grapheme>处</grapheme>
    <phoneme>chu3</phoneme>
    <!-- pinyin string is: "chǔ" in 处罚 处置 -->
  </lexeme>
  <lexeme role="claws:NN">
    <!-- common noun, neutral for number -->
    <grapheme>处</grapheme>
    <phoneme>chu4</phoneme>
    <!-- pinyin string is: "chù" in 处所 妙处 -->
  </lexeme>
</lexicon>

Homographs (2/2):
<speak version="1.1"
    xmlns:claws="http://www.example.com/claws7tags"
    xml:lang="zh-CN">
  <lexicon uri="http://www.example.com/lexicon.pls"
      type="application/pls+xml" xml:id="mylex"/>
  <lookup ref="mylex">
    他这个人很不好相<w role="claws:VV0">处</w>。
    此<w role="claws:NN">处</w>不准照相。
  </lookup>
</speak>
SSML 1.1 will support the role attribute
Currently PLS doesn’t define or mandate any taxonomy
PLS generally defines role values as qualified names (QNames)

Acronyms, Abbreviations, etc.:
Pronunciations expressed as a sequence of other orthographies (acronyms, abbreviations, etc.) are represented by the <alias> element
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" … alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>W3C</grapheme>
    <alias>World Wide Web Consortium</alias>
  </lexeme>
  <lexeme>
    <grapheme>101</grapheme>
    <alias>one hundred and one</alias>
  </lexeme>
</lexicon>

Latest Improvements:
The W3C Last Call Working Draft stage allows public comments to be addressed
  The large majority were clarifications
  New functionalities were deferred to a future version of the PLS specification
Major clarifications were about
  <alias> recursion
  Multiple pronunciations
Changes are subject to formal approval by the Working Group
Next Steps
  PLS 1.0 is very close to Candidate Recommendation stage
  SSML 1.1 will provide more complete support for PLS 1.0

<alias> recursion:
Pronunciations of the <alias> element contents MUST be generated by the processor, using pronunciations described by the <phoneme> element of any constituent graphemes in the PLS document, and without invoking recursive access to the PLS document on the <alias> elements of any constituent graphemes
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" … alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>GNU</grapheme>
    <alias>GNU is Not Unix</alias>
    <phoneme>gəˈnuː</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>Unix</grapheme>
    <grapheme>UNIX</grapheme>
    <alias>a multiplexed information and computing service</alias>
    <phoneme>ˈjuːnɪks</phoneme>
  </lexeme>
</lexicon>
GNU is pronounced: gəˈnuː is Not ˈjuːnɪks

Multiple pronunciations (1/2):
ASR
  If more than one pronunciation for a given <lexeme> is specified, an ASR processor MUST consider each of them as valid pronunciations for the <grapheme>
TTS
  If more than one pronunciation for a given <lexeme> is specified, a TTS processor MUST use the first one in document order that has the prefer attribute set to "true"
  If none of the pronunciations has prefer set to "true", the TTS processor MUST use the first one in document order unless the TTS processor is documented as having a method of selecting pronunciations, in which case the processor MUST use any one of the pronunciations

Multiple pronunciations (2/2):
An ASR processor will recognize both pronunciations, whereas a TTS processor will only use the first one (because it is the first in document order that has prefer set to "true").
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" … alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>lead</grapheme>
    <alias prefer="true">led</alias>
    <phoneme prefer="true">liːd</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>led</grapheme>
    <phoneme>led</phoneme>
  </lexeme>
</lexicon>

References:
PLS 1.0 Second Last Call Working Draft (26 October 2006): http://www.w3.org/TR/2006/WD-pronunciation-lexicon-20061026/
Voice Browser Activity Page (VoiceXML, SSML, SRGS, …): http://www.w3.org/Voice/
International Phonetic Association: http://www.arts.gla.ac.uk/IPA/
VoiceXML Forum: http://www.voicexml.org/

Final Remarks:
THANK YOU
For more information please
  Visit Loquendo’s booth #509
  Keep an eye on: www.loquendo.com
  Contact us: [email protected]
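
The selection rules on the "Multiple pronunciations" slides can be sketched in Python as follows. This is an illustration under stated assumptions rather than a conforming processor: the function names are hypothetical, the lexicon is the "lead"/"led" example from the slides with the elided attributes left out, and the sketch ignores the case where a TTS processor documents its own selection method. It also returns an <alias> as plain text and does not resolve it further.

```python
import xml.etree.ElementTree as ET

# Namespace URI as shown on the "PLS Example" slide, in ElementTree's {uri} form.
NS = "{http://www.w3.org/2005/01/pronunciation-lexicon}"

PLS_DOC = """<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>lead</grapheme>
    <alias prefer="true">led</alias>
    <phoneme prefer="true">liːd</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>led</grapheme>
    <phoneme>led</phoneme>
  </lexeme>
</lexicon>"""


def candidates(root, word):
    """Hypothetical helper: <phoneme>/<alias> elements for `word`, in document order."""
    found = []
    for lexeme in root.iter(NS + "lexeme"):
        spellings = [g.text for g in lexeme.findall(NS + "grapheme")]
        if word in spellings:
            for child in lexeme:
                if child.tag in (NS + "phoneme", NS + "alias"):
                    found.append(child)
    return found


def describe(element):
    """Return (kind, text), e.g. ("phoneme", "liːd")."""
    return (element.tag[len(NS):], element.text)


def asr_pronunciations(root, word):
    """ASR rule: every listed pronunciation is valid."""
    return [describe(c) for c in candidates(root, word)]


def tts_pronunciation(root, word):
    """TTS rule: first entry with prefer="true", else first in document order."""
    found = candidates(root, word)
    if not found:
        return None  # word is not in the lexicon
    for c in found:
        if c.get("prefer") == "true":
            return describe(c)
    return describe(found[0])


root = ET.fromstring(PLS_DOC.encode("utf-8"))
print(asr_pronunciations(root, "lead"))
print(tts_pronunciation(root, "lead"))
```

For "lead", the ASR function returns both entries, while the TTS function returns the alias "led" because it is the first entry in document order with prefer set to "true".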
