
Pronunciation Variation Modeling for Korean Point-of-Interest Data Using Prosodic Information  

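Kim, Sun-He (Institute of Humanities Information, Seoul National University)
Park, Jeon-Gue (Electronics and Telecommunications Research Institute)
Na, Min-Soo (Interdisciplinary Program in Cognitive Science, Seoul National University)
Jeon, Je-Hun (Department of Computer Science, Sogang University)
Chung, Min-Wha (Department of Linguistics, Seoul National University)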
Abstract
This paper examines how the performance of an automatic speech recognizer on Korean Point-of-Interest (POI) data was improved by modeling pronunciation variation with structural prosodic information, namely prosodic words and syllable length. First, because each POI word can be broken down into prosodic words, multiple pronunciation variants were generated on the basis of prosodic words. Second, cross-prosodic-word variations were modeled with the syllable length of the word taken into account. A total of 81 experiments were conducted using 9 test sets (3 baseline and 6 proposed) on 9 trained sets (3 baseline and 6 proposed). The results show that (i) performance improved when the pronunciation lexica were generated using prosodic words; (ii) the best performance was achieved when the maximum number of variants was constrained to 3 based on syllable length; and (iii) relative to the baseline word error rate (WER) of 4.63%, a relative WER reduction of up to 8.4% was achieved when both prosodic words and syllable length were considered.
Keywords
Pronunciation modeling; Pronunciation variation; Prosodic word; Syllable; Point-of-Interest data
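
The abstract describes generating pronunciation variants per prosodic word and capping the number of variants by syllable length. The following is a minimal sketch of that idea under stated assumptions, not the authors' implementation: the function names, the capping policy (`min(cap, number of syllables)`), the variant table, and the romanized pronunciation strings are all hypothetical and chosen only for illustration.

```python
from itertools import islice, product


def syllable_count(word: str) -> int:
    """Count precomposed Hangul syllable blocks (U+AC00..U+D7A3) in a word."""
    return sum(1 for ch in word if "\uac00" <= ch <= "\ud7a3")


def max_variants_for(n_syllables: int, cap: int = 3) -> int:
    """Hypothetical policy: allow at most `cap` variants, fewer for very short entries."""
    return max(1, min(cap, n_syllables))


def build_variants(prosodic_words, variant_table, cap=3):
    """Combine per-prosodic-word pronunciations into entry-level variants.

    prosodic_words: the POI entry already segmented into prosodic words.
    variant_table:  hypothetical mapping from a prosodic word to its candidate
                    pronunciations; unknown words fall back to their own spelling.
    """
    per_word = [variant_table.get(pw, [pw]) for pw in prosodic_words]
    total_syllables = sum(syllable_count(pw) for pw in prosodic_words)
    limit = max_variants_for(total_syllables, cap)
    # Cartesian product over prosodic words, truncated to the allowed number.
    return [" ".join(combo) for combo in islice(product(*per_word), limit)]


# Toy data: the pronunciation strings below are made up for illustration.
toy_table = {
    "서울": ["seoul"],
    "대학교": ["daehakkyo", "daeakkyo"],
    "병원": ["byeongwon", "byeongweon"],
}

variants = build_variants(["서울", "대학교", "병원"], toy_table)
```

In this sketch the entry has 8 syllables, so the cap of 3 applies and only the first three of the four possible combinations are kept. A real system would presumably rank variants (for example by frequency) before truncating; that ranking step is not described in the abstract and is omitted here.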