Speech Recognition of the Korean Vowel 'ㅜ' Based on Time Domain Bulk Indicators

Lee, Jae Won

doi:10.5626/KTCP.2016.22.11.591

KIISE Transactions on Computing Practices (정보과학회 컴퓨팅의 실제 논문지)

Volume 22 Issue 11 / Pages 591-600 / 2016 / 2383-6318 (pISSN) / 2383-6326 (eISSN)

Korean Institute of Information Scientists and Engineers (한국정보과학회)

시간 영역 벌크 지표에 기반한 한국어 모음 'ㅜ'의 음성 인식

Lee, Jae Won (Sungshin Women's Univ.)

이재원 (성신여자대학교 IT학부)

Received : 2016.08.01
Accepted : 2016.09.19
Published : 2016.11.15

Abstract

As network and computing technologies advance, computing is increasingly integrated into almost every everyday human environment. In addition, rapidly growing interest in the IoT has led to the wide acceptance of speech recognition as a means of HCI. In this study, we present a novel method for recognizing the Korean vowel 'ㅜ' as part of a phoneme-based Korean speech recognition system. The proposed method analyzes bulk indicators calculated in the time domain instead of performing analysis in the frequency domain, which reduces the computational cost. Four elementary algorithms that detect typical waveform patterns of 'ㅜ' using bulk indicators are presented and combined to make the final decision. The experimental results show that the proposed method can achieve a recognition accuracy of 90.1% and a recognition speed of 0.68 msec per syllable.

네트워크와 컴퓨팅 기술의 발달로 인해 인간이 생활하는 거의 모든 일상 환경에 컴퓨팅 기술의 접목이 증대되고 있다. 또한, 사물 인터넷에 대한 관심이 급속히 증대되면서, 음성 인식은 중요한 HCI 수단으로 자리 잡고 있다. 본 논문은 음소 기반 한국어 음성 인식 시스템의 일부로서, 한국어 모음 'ㅜ'에 대한 새로운 인식 방식을 제안한다. 제안하는 방식은 주파수 영역에서의 분석 대신, 시간 영역에서 계산한 벌크 지표를 분석하여 동작하므로, 계산 비용을 현저히 절감할 수 있다. 벌크 지표를 사용하여 모음 'ㅜ'의 전형적인 파형 패턴들을 탐지하기 위한 네 가지 요소 알고리즘을 제시하며, 이를 결합하여 최종적인 판별을 수행한다. 실험 결과를 통해, 제안하는 방식이 90.1%의 인식 정확도를 달성할 수 있음을 확인하였으며, 인식 속도는 어절 당 0.68 msec이다.

Acknowledgement

Supported by: Sungshin Women's University (성신여자대학교)

