[KSCI] Korea Science Citation Index Service

Analysis of Korean Spontaneous Speech Characteristics for Spoken Dialogue Recognition

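박영희 (Spoken Language Processing Laboratory, Department of Computer Science, Sogang University)
정민화 (Spoken Language Processing Laboratory, Department of Computer Science, Sogang University)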

Publication Information

The Journal of the Acoustical Society of Korea / v.21, no.3, 2002, pp. 330-338

Abstract

Compared with read speech, spontaneous speech is often ungrammatical and shows severe phonological variation, both of which make recognition extremely difficult. In this paper, we analyze transcriptions of real conversational speech and classify its characteristics from a speech recognition perspective. Reflecting these characteristics, we build a baseline system for conversational speech recognition. The classification covers long silences, disfluencies, and phonological variations; within each class, items with similar features are grouped together. To deal with these characteristics, we first update the silence model and add a filled pause model and a garbage model; second, we add multiple phonetic transcriptions to the lexicon for the most frequent phonological variations. In our experiments, the baseline morpheme error rate (MER) is 31.65%; we obtain MER reductions of 2.08% with the silence and garbage models, 0.73% with the filled pause model, and 0.73% with the phonological variations. The final MER for conversational speech recognition is 27.92%, which will serve as a baseline for further study.
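
The abstract reports results as morpheme error rate (MER) but does not give the formula. As a minimal sketch, assuming MER follows the usual error-rate convention of (substitutions + deletions + insertions) divided by the number of reference morphemes, it could be computed with an edit-distance alignment over morpheme sequences. The function name and the placeholder tokens below are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def morpheme_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Return (substitutions + deletions + insertions) / len(reference).

    Assumption: MER is defined like word error rate, but over morphemes.
    """
    if not reference:
        raise ValueError("reference must contain at least one morpheme")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and hypothesis[:j]; it starts with an empty prefix.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i]  # aligning reference[:i] with an empty hypothesis costs i deletions
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference morpheme
                curr[j - 1] + 1,     # insertion of a hypothesis morpheme
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(reference)


# Placeholder morpheme tokens: one substitution (m2 -> mX) and one deletion (m4).
reference = ["m1", "m2", "m3", "m4"]
hypothesis = ["m1", "mX", "m3"]
print(morpheme_error_rate(reference, hypothesis))
```

Under this assumed definition, the example has two errors against four reference morphemes. Whether the reported reductions are absolute or relative is not stated in the abstract, so the sketch makes no claim about how the individual figures combine.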

Keywords

Conversational speech recognition; Spontaneous speech recognition; Disfluencies; Noise; Pronunciation variations; Filled pauses; Garbage model

Korean title: 대화체 연속음성 인식을 위한 한국어 대화음성 특성 분석