http://dx.doi.org/10.5626/KTCP.2017.23.11.617

Speech Recognition of the Korean Vowel 'ㅡ' based on Neural Network Learning of Bulk Indicators  

Lee, Jae Won (Sungshin Women's Univ.)
Publication Information
KIISE Transactions on Computing Practices / v.23, no.11, 2017, pp. 617-624
Abstract
Speech recognition is now one of the most widely used technologies in HCI. Many applications that can use speech recognition, such as home automation, automatic speech translation, and car navigation, are under active development, and demand for speech recognition systems in mobile environments is rapidly increasing. This paper presents a method for instant recognition of the Korean vowel 'ㅡ' as part of a Korean speech recognition system. The proposed method uses bulk indicators calculated in the time domain rather than frequency-domain features, which reduces the computational cost of recognition. Neural networks learn the bulk indicators that represent the predominant sequence patterns of the vowel 'ㅡ', and the trained networks make the final recognition decisions. In the experiments, the proposed method achieved 88.7% recognition accuracy at a recognition speed of 0.74 msec per syllable.
Keywords
speech recognition; vowel; bulk indicator; neural network
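
The pipeline outlined in the abstract (time-domain indicators fed to a trained neural network) can be illustrated with a minimal sketch. The abstract does not define the bulk indicators, so the three per-frame features below (mean absolute amplitude, peak-to-peak range, zero-crossing rate) are hypothetical stand-ins chosen only because they need no frequency transform. The function names, frame length, network size, and synthetic signals are likewise assumptions, not the paper's setup.

```python
import numpy as np

def time_domain_features(signal, frame_len=256):
    """Per-frame features computed without any FFT.

    Hypothetical stand-ins for the paper's bulk indicators,
    which the abstract does not define.
    """
    n_frames = len(signal) // frame_len
    frames = np.asarray(signal[: n_frames * frame_len], dtype=float)
    frames = frames.reshape(n_frames, frame_len)
    mean_abs = np.abs(frames).mean(axis=1)                   # average magnitude
    peak_to_peak = frames.max(axis=1) - frames.min(axis=1)   # amplitude range
    # Fraction of adjacent sample pairs whose sign differs.
    sign_change = np.signbit(frames[:, 1:]) != np.signbit(frames[:, :-1])
    zero_cross = sign_change.mean(axis=1)
    return np.stack([mean_abs, peak_to_peak, zero_cross], axis=1)

def init_mlp(n_in, n_hidden, rng):
    """One hidden layer with small random weights (sizes are assumptions)."""
    return {
        "W1": rng.normal(0, 0.5, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
        "W2": rng.normal(0, 0.5, (n_hidden, 1)), "b2": np.zeros(1),
    }

def forward(params, x):
    """Return hidden activations and the probability of the target class."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    logits = h @ params["W2"] + params["b2"]
    return h, (1.0 / (1.0 + np.exp(-logits))).ravel()

def train_step(params, x, y, lr=0.1):
    """One full-batch gradient step on the binary cross-entropy loss."""
    h, p = forward(params, x)
    grad_logit = (p - y)[:, None] / len(y)
    grad_h = (grad_logit @ params["W2"].T) * (1 - h ** 2)
    params["W2"] -= lr * (h.T @ grad_logit)
    params["b2"] -= lr * grad_logit.sum(axis=0)
    params["W1"] -= lr * (x.T @ grad_h)
    params["b1"] -= lr * grad_h.sum(axis=0)

# Synthetic stand-in data, not real speech: a periodic tone versus noise.
rng = np.random.default_rng(0)
t = np.arange(256 * 20)
tone = np.sin(2 * np.pi * t / 64)
noise = rng.normal(0, 0.3, t.size)
x = np.vstack([time_domain_features(tone), time_domain_features(noise)])
y = np.concatenate([np.ones(20), np.zeros(20)])

params = init_mlp(n_in=3, n_hidden=4, rng=rng)
for _ in range(200):
    train_step(params, x, y)
_, probs = forward(params, x)  # per-frame probability of the "tone" class
```

Each feature is a single pass over the samples of a frame, which is the kind of cost saving the abstract attributes to staying in the time domain. The accuracy and timing figures reported in the abstract come from the paper's own indicators and data, and this sketch makes no attempt to reproduce them.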