Korean Broadcast News Transcription Using Morpheme-based Recognition Units  

Kwon, Oh-Wook (Brain Science Research Center, KAIST)
Alex Waibel (Interactive Systems Laboratories, University of Karlsruhe)
Abstract
Broadcast news transcription is one of the hardest tasks in speech recognition because broadcast speech varies widely in speech quality, channel, and background conditions. We developed a Korean broadcast news speech recognizer. To reduce the out-of-vocabulary (OOV) rate, we used a morpheme-based dictionary and language model. To reduce the insertion and deletion errors caused by short morphemes, we concatenated morpheme pairs that were short or occurred frequently. We used a lexicon with multiple pronunciations to reflect inter-morpheme pronunciation variations without severe modification of the search tree. Using the merged morphemes as recognition units, we achieved an OOV rate of 1.7%, comparable to that of European languages with a 64k vocabulary. We implemented a hidden Markov model-based recognizer with vocal tract length normalization and online speaker adaptation by maximum likelihood linear regression. In experiments, the recognizer yielded a morpheme error rate of 21.8% for anchor speech and 31.6% for mostly noisy reporter speech.
Keywords
Continuous speech recognition; Broadcast news transcription; Language model; Morpheme-based speech recognition;
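
The morpheme-merging step and the OOV rate mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the thresholds `min_count` and `max_pair_len`, the single greedy left-to-right pass, and the toy morpheme sequences are all assumptions made for illustration. A pair is merged here if it is frequent or if its combined length is short, which is one possible reading of "short length or high frequency".

```python
from collections import Counter


def count_pairs(sequences):
    """Count adjacent morpheme pairs over all sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    return counts


def select_pairs(counts, min_count, max_pair_len):
    """Pick pairs that are frequent OR short (both thresholds are hypothetical)."""
    return {
        (a, b)
        for (a, b), n in counts.items()
        if n >= min_count or len(a) + len(b) <= max_pair_len
    }


def merge_sequence(seq, selected):
    """Greedily concatenate selected adjacent pairs, left to right, in one pass."""
    out = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) in selected:
            out.append(seq[i] + seq[i + 1])
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def oov_rate(vocab, tokens):
    """Fraction of tokens that are not in the vocabulary."""
    if not tokens:
        return 0.0
    return sum(t not in vocab for t in tokens) / len(tokens)


# Toy training data (assumed): each item is the morpheme sequence of one word.
train = [
    ["가", "았", "다"],
    ["오", "았", "다"],
    ["먹", "었", "다"],
    ["학교", "에서"],
    ["학교", "에서"],
    ["도서관", "에서"],
]

selected = select_pairs(count_pairs(train), min_count=2, max_pair_len=2)
merged_train = [merge_sequence(seq, selected) for seq in train]
vocab = {unit for seq in merged_train for unit in seq}

print(merged_train)
print(oov_rate(vocab, merge_sequence(["집", "에서"], selected)))
```

Merging lengthens the average recognition unit at the cost of a larger vocabulary, so in practice the thresholds would be tuned against both the OOV rate and the error rate; the values above are placeholders.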