D301 Patel

Published on October 29, 2007

Author: Christian

Source: authorstream.com

Content

Slide1:
Multilingual Issues in Speech Synthesis and Recognition in Indian Languages
Nixon Patel, Bhrigus Inc, [email protected]
Multilingual & International Speech Applications, SpeechTek West 2007, Hilton San Francisco

Abstract:
This paper describes our work in developing multilingual speech recognition and speech synthesis systems for Indian languages. Existing TTS and ASR technology covers US English, Indian English, and Hindi; no such systems exist for any other Indian language.

Introduction:
Voice-enabled services are a rapidly growing, high-margin opportunity, particularly in a multilingual country such as India. It is very difficult to build a separate speech synthesizer for each language. The focus is therefore on developing common multilingual corpora that support multiple Indian languages, and on building appropriate language-specific linguistic analysis modules for text-to-speech synthesis.

Important issues involved:
Enumerating a phone set to represent Indian languages.
Selecting the basic unit for synthesis: half-phones, diphones, or syllables.
Creating a generic acoustic database that covers language variations.
Modeling language-specific prosody.

Our approaches:
A common notation for graphemes, developed using IT-3 transliteration.
Diphone-based speech synthesis.
Data-driven prosody modeling using Classification and Regression Trees (CART).
Concatenative synthesis using cluster unit selection techniques with syllable-like units.
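As an illustration of the concatenative approach listed under "Our approaches", the following is a minimal sketch of generic unit selection over syllable-like units. It is not the Bhrigus engine and it omits the clustering step of cluster unit selection: the `Unit` class, the pitch-only `target_cost` and `join_cost` functions, and the toy database are all assumptions made for the example. The idea shown is only that each target syllable has several recorded candidates, and a dynamic-programming search picks the sequence with the lowest combined target and join cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A recorded syllable-like unit (hypothetical feature set)."""
    syllable: str
    pitch: float  # mean F0 in Hz, used here as the only acoustic feature


def target_cost(unit, wanted_pitch):
    # How far the candidate is from the prosody we asked for.
    return abs(unit.pitch - wanted_pitch)


def join_cost(prev, cur):
    # How audible the seam between two consecutive units would be.
    return abs(prev.pitch - cur.pitch)


def select_units(targets, database):
    """Pick one candidate per target so that total cost is minimal.

    targets:  list of (syllable, wanted_pitch) pairs
    database: dict mapping syllable -> list of candidate Units
    """
    layers = []  # layers[i] holds (best_cost, back_pointer, unit) per candidate
    for i, (syllable, wanted_pitch) in enumerate(targets):
        layer = []
        for cand in database[syllable]:
            tc = target_cost(cand, wanted_pitch)
            if i == 0:
                layer.append((tc, None, cand))
                continue
            prev_layer = layers[-1]
            best_k = min(
                range(len(prev_layer)),
                key=lambda k: prev_layer[k][0] + join_cost(prev_layer[k][2], cand),
            )
            cost = prev_layer[best_k][0] + join_cost(prev_layer[best_k][2], cand) + tc
            layer.append((cost, best_k, cand))
        layers.append(layer)

    # Trace back from the cheapest final candidate.
    idx = min(range(len(layers[-1])), key=lambda k: layers[-1][k][0])
    path = []
    for layer in reversed(layers):
        _, back, cand = layer[idx]
        path.append(cand)
        idx = back
    path.reverse()
    return path


database = {
    "na": [Unit("na", 120.0), Unit("na", 200.0)],
    "ma": [Unit("ma", 125.0), Unit("ma", 210.0)],
}
chosen = select_units([("na", 130.0), ("ma", 130.0)], database)
print([(u.syllable, u.pitch) for u in chosen])
```

A production system would use richer features (spectral distance at the join, duration, phonetic context) and would typically restrict the candidate set per unit, for example through CART-based clustering, before running the search.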
Our current research work:
Text-to-speech synthesis: TTS is a multilingual text-to-speech engine that enables speech applications to be built in local Indian languages, using a unit selection algorithm and a large corpus. A Telugu TTS system has been built, and a voice portal that reads out local-language news in Telugu has been developed.
Speech recognition: ASR is a multilingual automatic speech recognition system that, in conjunction with our TTS, will enable full-fledged speech solutions. The advanced features of this engine allow customization to a vertical within a few hours.

Slide7:
Search engine: This is a cross-lingual search engine capable of searching through content in all Indian languages. It makes use of several features of Indian language scripts, including their phonetic nature, common phonetic base, and syllabic structure. Its other novelty is that it uses phonetic-level units for indexing, which enables seamless cross-lingual search across the languages.
Phonetic typing tool: This tool makes use of an intuitive, readable transliteration scheme and phonetic properties to key in scripts in Indian languages. The Bhrigus phonetic typing tool comes with a friendly user interface as well as APIs for integration into applications such as email and blogging frameworks.

Slide8:
Font converters: Indian-language text in electronic form is in a chaotic state. One can neither exchange notes in Indian languages as conveniently as in English, nor search Indian-language texts available on the web.
This is because the texts are stored in font-dependent glyph codes, and the glyph coding scheme typically differs from font to font. To view the content of these sites, one needs the corresponding fonts on the local machine. We are building font converters for almost all Indian languages.
Multilingual dictionary: We are developing a multilingual dictionary with English as the source language and Indian languages such as Telugu, Tamil, Gujarati, and Hindi as target languages.

Bhrigus ASR and TTS Process Framework:
The components of a TTS system can be divided into a language-independent component (LIC) and a language-dependent component (LDC). The LIC consists of the speech synthesis engine, which handles the unit selection algorithm and signal processing. The LDC covers building the language-specific resources, such as the pronunciation dictionary and the unit selection database, needed to build a synthetic voice.

Language-dependent (LDC) and language-independent (LIC) components of a TTS system

Language-dependent (LDC) and language-independent (LIC) components of an ASR system

Slide12:
The development time for a TTS or an ASR system consists of the time to develop the LIC components and the LDC components. The LIC component of the ASR system is the Bhrigus ASR speech recognition engine, while the LIC component of the TTS system is the Bhrigus TTS unit selection engine.
It is suggested that the LDC components for ASR and TTS be built together, as this decreases development time, primarily because language-dependent resources are shared across the TTS and ASR systems.

Slide13:
The LDC resources that can be shared across TTS and ASR systems are text data, the pronunciation dictionary, and letter-to-sound rules. The collected text is used to build language models for ASR and, at the same time, to extract an optimal set of sentences to be recorded for the TTS system. Similarly, the pronunciation dictionary and letter-to-sound rules can be shared across the TTS and ASR systems. Several modules inside the TTS and ASR engines can be shared as well.

Demos:
Demos are at http://196.12.38.23/index.html

Conclusion:
The four basic principles are to:
create and sustain the leading market solution for professional services in text-to-speech, speech-to-text, search, machine translation, and natural dialogue management for Indian languages, including Indian English;
interface that solution into the vast majority of technical environments relevant to these types of applications;
provide skilled services; and
provide services at differentiated low rates.
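The resource sharing described under Slide13 can be sketched as follows. This is a minimal illustration under stated assumptions, not the Bhrigus pipeline: the function names, the toy romanized lexicon, and the greedy coverage criterion are all hypothetical. The same collected text feeds bigram counts (the raw material for an ASR language model) and a greedy selection of sentences to record for TTS, and both steps could consult the same pronunciation dictionary.

```python
from collections import Counter


def bigram_counts(sentences):
    """Count word bigrams, the raw statistics behind a simple ASR language model."""
    counts = Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        counts.update(zip(words, words[1:]))
    return counts


def units_of(sentence, lexicon):
    """Syllable-like units a sentence would contribute, per the shared lexicon."""
    units = set()
    for word in sentence.split():
        units.update(lexicon.get(word, []))
    return units


def select_recording_script(sentences, lexicon, max_sentences):
    """Greedily pick sentences that add the most not-yet-covered units."""
    covered, script = set(), []
    remaining = list(sentences)
    while remaining and len(script) < max_sentences:
        best = max(remaining, key=lambda s: len(units_of(s, lexicon) - covered))
        gain = units_of(best, lexicon) - covered
        if not gain:
            break  # nothing left would add new units
        script.append(best)
        covered |= gain
        remaining.remove(best)
    return script, covered


# Toy shared resources (illustrative entries only, not a real lexicon).
lexicon = {
    "nama": ["na", "ma"],
    "mana": ["ma", "na"],
    "rama": ["ra", "ma"],
    "sita": ["si", "ta"],
}
corpus = ["nama mana", "rama sita", "mana nama"]

lm_counts = bigram_counts(corpus)                                # used on the ASR side
script, covered = select_recording_script(corpus, lexicon, 3)    # used on the TTS side
print(script, sorted(covered))
```

Greedy coverage is only one possible notion of an "optimal" sentence set; a real recording script would also weigh unit frequency, phonetic context, and sentence length.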
