[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.6109/jkiice.2016.20.3.471

Speech Recognition Accuracy Prediction Using Speech Quality Measure

Ji, Seung-eun (Department of Computer Science & Engineering, Incheon National University)
Kim, Wooil (Department of Computer Science & Engineering, Incheon National University)

Publication Information

Journal of the Korea Institute of Information and Communication Engineering / v.20, no.3, 2016, pp. 471-476

Abstract

This paper presents our study on speech recognition performance prediction. Our initial study showed that a combination of speech quality measures correlates more strongly with Word Error Rate (WER) than any single speech measure alone. In this paper, we demonstrate that a new combination of various types of speech quality measures improves the correlation with WER significantly further than the combination used in our initial study. SNR, PESQ, acoustic model score, and MFCC distance are used as the speech quality measures. This paper also presents our speech database verification system for speech recognition, which employs these speech measures. We develop a WER prediction system that uses a Gaussian mixture model with the speech quality measures as the feature vector. The experimental results show that the proposed system is highly effective at predicting WER under low-SNR conditions in speech babble and car noise environments.
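The claim that a combination of measures correlates better with WER than any single measure can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the helper names `pearson` and `combined_score` are hypothetical, and a least-squares linear combination stands in for the combination method, which the abstract does not specify. It is not the Gaussian mixture model predictor described by the authors.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def combined_score(measures, wer):
    """Least-squares linear combination of the measures (plus a bias term) fitted to WER.

    This is an assumed stand-in for the paper's combination method.
    """
    X = np.column_stack([measures, np.ones(len(measures))])
    weights, *_ = np.linalg.lstsq(X, wer, rcond=None)
    return X @ weights

# Synthetic per-utterance measures: made-up values, not data from the paper.
rng = np.random.default_rng(0)
n = 200
snr = rng.uniform(0.0, 20.0, n)          # signal-to-noise ratio in dB
pesq = rng.uniform(1.0, 4.5, n)          # PESQ score
am_score = rng.uniform(-80.0, -40.0, n)  # acoustic model score (log-likelihood-like)
mfcc_dist = rng.uniform(0.0, 10.0, n)    # MFCC distance

# Synthetic WER (%) that depends on all four measures plus noise.
wer = 40.0 - 1.5 * snr - 5.0 * pesq - 0.3 * am_score + 2.0 * mfcc_dist
wer = np.clip(wer + rng.normal(0.0, 5.0, n), 0.0, 100.0)

measures = np.column_stack([snr, pesq, am_score, mfcc_dist])
names = ["SNR", "PESQ", "AM score", "MFCC distance"]

# Correlation of each single measure with WER.
singles = {name: pearson(measures[:, i], wer) for i, name in enumerate(names)}
# Correlation of the fitted combination with WER.
r_combined = pearson(combined_score(measures, wer), wer)

for name, r in singles.items():
    print(f"{name}: r = {r:+.3f}")
print(f"combined: r = {r_combined:+.3f}")
```

Because the combination is fitted and evaluated on the same utterances, its correlation with WER can never fall below the magnitude of any single measure's correlation. A fair comparison would evaluate the fitted weights on held-out utterances.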

Keywords

Word error rate; Correlation coefficient; Performance prediction; Speech recognition; Speech quality measure

Citations & Related Records

Reference

1	S. -Y. Yoon, L. Chen and K. Zechner, "Predicting Word Accuracy for the Automatic Speech Recognition of Non-native Speech," Interspeech-2010, pp. 773-776, 2010.
2	W. Kim and J. H. L. Hansen, "Phonetic Distance Based Confidence Measure," IEEE Signal Processing Letters, vol. 17, no. 2, pp. 121-124, Feb. 2010.
3	S. Ji and W. Kim, "A Study on Speech Measure Analysis for Speech Recognition Accuracy Estimation in Noisy Environments," A Conference of Acoustical Society of Korea, vol. 34, no. 1, pp. 46, May 2015.
4	S. Ji, J. Cho and W. Kim, "Development of Database Verification System for Automatic Speech Recognition," KCC2015, vol. 34, pp. 719-720, June 2015.
5	S. Ji and W. Kim, "A Study on Effective Speech Recognition Performance Measure using MFCC Similarity," KSCSP-2015, vol. 32, no. 1, pp. 220-222, Aug. 2015.
6	S. Ji, M. Song, J. Yoon and W. Kim, "Speech Recognition Performance Prediction employing Speech Quality Measure," A Conference of Acoustical Society of Korea, vol. 34, no. 2, pp. 46, Nov. 2015.
7	STNR technique provided by National Institute of Standards and Technology (NIST) [Internet]. Available: http://www.nist.gov/speech
8	Y. Hu and P. C. Loizou, "Evaluation of Objective Measure for Speech Enhancement," IEEE Transactions on Audio, Speech, and Language Processing, vol. 16, no. 1, pp. 229-238, Sep. 2008.
9	Hidden Markov Model Toolkit (HTK) developed by Cambridge University. HTK software and tutorial download page [Internet]. Available: http://htk.eng.cam.ac.uk
10	TIMIT speech database provided by Linguistic Data Consortium (LDC) of University of Pennsylvania [Internet]. Available: https://catalog.ldc.upenn.edu/LDC93S1

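12	(2016) Journal of the Korea Institute of Information and Communication Engineering, Deep neural network-based speech recognition performance measure for effective speech recognition evaluation / 21 (12), 2291
5	(2016) Journal of the Korea Institute of Information and Communication Engineering, A study on improving word selection methods and devices to raise the speech recognition rate of people with speech impairments in severe noise environments / 23 (5), 555
21	(2016) Multimedia Tools and Applications, Adaptive recognition of different accents conversations based on convolutional neural network / 78 (21), 30749
11	(2019) Journal of the Korea Institute of Information and Communication Engineering, A study on improving (H/W, S/W) the speech recognition rate of people with speech impairments caused by neurological damage / 23 (11), 1397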
