Style-Specific Language Model Adaptation using TF*IDF Similarity for Korean Conversational Speech Recognition  

Park, Young-Hee (Department of Computer Science, Sogang University)
Chung, Min-Hwa (Department of Computer Science, Sogang University)
Abstract
In this paper, we propose a style-specific language model adaptation scheme using n-gram based tf*idf similarity for Korean spontaneous speech recognition. Korean spontaneous speech exhibits distinctive style-specific characteristics such as filled pauses, word omission, and contraction, which are related to function words and depend on the preceding or following words. To reflect these style-specific characteristics and to overcome the lack of data for training the language model, we estimate an in-domain dependent n-gram model by relevance weighting of out-of-domain text data according to their n-gram based tf*idf similarity, where the in-domain language model includes a disfluency model. Recognition results show that n-gram based tf*idf similarity weighting effectively reflects style differences.
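The relevance weighting described above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything specific here is an assumption: each text is treated as one document, similarity is taken as the cosine between bigram tf*idf vectors, the idf is smoothed, and each out-of-domain text's n-gram counts are scaled by its similarity to the in-domain text. The function names (`ngrams`, `tfidf_vectors`, `cosine`, `relevance_weighted_counts`) and the English toy sentences are hypothetical, and the disfluency model is not covered.

```python
import math
from collections import Counter


def ngrams(tokens, n=2):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf_vectors(docs, n=2):
    """Build one n-gram tf*idf vector (a dict) per tokenized document."""
    counts = [Counter(ngrams(doc, n)) for doc in docs]
    doc_freq = Counter(g for c in counts for g in c)  # documents containing g
    num_docs = len(docs)
    vectors = []
    for c in counts:
        # Assumed smoothed idf: stays positive for n-grams seen in every document.
        vectors.append({
            g: tf * (math.log((1 + num_docs) / (1 + doc_freq[g])) + 1.0)
            for g, tf in c.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(g, 0.0) for g, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def relevance_weighted_counts(in_domain, out_of_domain, n=2):
    """Add out-of-domain n-gram counts, each scaled by tf*idf similarity."""
    vectors = tfidf_vectors([in_domain] + out_of_domain, n)
    weights = [cosine(vectors[0], v) for v in vectors[1:]]
    counts = {}
    # In-domain counts enter with full weight 1.
    for g, c in Counter(ngrams(in_domain, n)).items():
        counts[g] = counts.get(g, 0.0) + c
    # Out-of-domain counts are discounted by their similarity weight.
    for doc, w in zip(out_of_domain, weights):
        for g, c in Counter(ngrams(doc, n)).items():
            counts[g] = counts.get(g, 0.0) + w * c
    return weights, counts


# Toy data: a conversational in-domain text and two out-of-domain texts.
in_domain = "uh I want to uh book a room".split()
out_of_domain = [
    "uh I want to book a flight".split(),
    "the committee approved the annual budget".split(),
]
weights, counts = relevance_weighted_counts(in_domain, out_of_domain)
print(weights)
```

In this sketch the out-of-domain text that shares conversational bigrams with the in-domain text receives a higher weight than the one that shares none, so its counts contribute more to the adapted n-gram statistics. Turning the weighted counts into probabilities would still require a smoothing method, which is left out here.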
Keywords
Korean conversational speech recognition; Language model adaptation; Disfluencies; Filled pauses