Speech Recognition of Korean Phonemes 'ㅅ', 'ㅈ', 'ㅊ' based on Volatility and Turning Points

  • Received : 2014.08.04
  • Accepted : 2014.10.01
  • Published : 2014.11.15

Abstract

A phoneme is the smallest unit of speech and plays an important role in speech recognition. As part of phoneme-based Korean speech recognition, this paper proposes a novel method for recognizing the Korean phonemes 'ㅅ', 'ㅈ', and 'ㅊ'. The method is based on a volatility indicator and a turning point indicator, each calculated for every block of the input speech signal. The volatility indicator is the sum of the differences between the values of adjacent samples in a block; the turning point indicator is the number of extremal points in a block at which the sample values switch between increasing and decreasing. A recognition algorithm combines the two indicators, using thresholds optimized for them, to determine the positions at which the three target phonemes are recognized. The experimental results show that the proposed method markedly reduces the error rate relative to existing methods, in terms of both the false reject rate (FRR) and the false accept rate (FAR).
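The two indicators described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the exact formulas nor the threshold values. The sketch assumes that "differences" means absolute differences (a signed sum would collapse to last sample minus first), that flat steps are ignored when counting turning points, and that a block is flagged when both indicators meet a lower bound. The function names, `block_size`, and both thresholds are hypothetical.

```python
from typing import List, Sequence


def volatility(block: Sequence[float]) -> float:
    """Sum of absolute differences between adjacent samples in a block.

    Assumption: absolute values are used; the abstract only says "differences".
    """
    return sum(abs(b - a) for a, b in zip(block, block[1:]))


def turning_points(block: Sequence[float]) -> int:
    """Count the points where the samples switch between rising and falling."""
    count = 0
    prev_sign = 0  # 0 means no direction has been observed yet
    for a, b in zip(block, block[1:]):
        diff = b - a
        if diff == 0:
            continue  # assumption: a flat step carries no direction
        sign = 1 if diff > 0 else -1
        if prev_sign != 0 and sign != prev_sign:
            count += 1
        prev_sign = sign
    return count


def candidate_blocks(
    signal: Sequence[float],
    block_size: int,
    vol_threshold: float,
    tp_threshold: int,
) -> List[int]:
    """Return indices of non-overlapping blocks where both indicators
    reach their thresholds.

    Hypothetical decision rule: the paper's actual combination of the two
    indicators and its optimized thresholds are not given in the abstract.
    """
    hits = []
    for start in range(0, len(signal) - block_size + 1, block_size):
        block = signal[start:start + block_size]
        if (volatility(block) >= vol_threshold
                and turning_points(block) >= tp_threshold):
            hits.append(start // block_size)
    return hits
```

For example, the block `[0, 2, 1, 3]` has adjacent differences of 2, 1, and 2, so its volatility is 5, and its direction flips twice, so it has two turning points. A constant block scores zero on both indicators.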
