A Land and Maritime Unified Tourism Information Guide System Based on Robust Speech Recognition in Ship Noise Environments

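  • 전광명 (Human Media Communication and Processing Laboratory, School of Information and Communications, Gwangju Institute of Science and Technology) ;
  • 이장원 (Human Media Communication and Processing Laboratory, School of Information and Communications, Gwangju Institute of Science and Technology) ;
  • 박지훈 (Human Media Communication and Processing Laboratory, School of Information and Communications, Gwangju Institute of Science and Technology) ;
  • 이성로 (Mokpo National University) ;
  • 이연우 (Mokpo National University) ;
  • 맹세영 (Mokpo National University) ;
  • 김홍국 (Human Media Communication and Processing Laboratory, School of Information and Communications, Gwangju Institute of Science and Technology)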
  • Received : 2013.01.14
  • Accepted : 2013.02.19
  • Published : 2013.02.28

Abstract

This paper proposes a unified land and maritime tourism information guide system that employs speech recognition robust to ship noise. Most conventional speech recognition front-ends use a Wiener filter to compensate for stationary noise such as car or babble noise. However, such front-ends are limited in their ability to reduce the non-stationary noise that occurs inside a ship during a voyage. To overcome this limitation, the proposed system incorporates nonlinear multi-band spectral subtraction to provide highly accurate tourism route recognition. Experiments show that, compared with a conventional system, the proposed system achieves a relative improvement of 5.54% in tourism route recognition rate under a noise condition of 10 dB signal-to-noise ratio (SNR).
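
The abstract does not give the details of the nonlinear multi-band spectral subtraction, so the following is only a minimal sketch of the general technique, not the authors' implementation. It assumes a per-frame power spectrum and a separately obtained noise power estimate. The function names, the number of bands, the spectral floor, and the SNR-to-over-subtraction mapping are all illustrative assumptions.

```python
import math


def over_subtraction_factor(snr_db):
    """Map a band SNR in dB to an over-subtraction factor.

    The breakpoints and slope are illustrative assumptions: noisier bands
    (low SNR) get a larger factor, cleaner bands get a factor of 1.
    """
    if snr_db < -5.0:
        return 4.75
    if snr_db <= 20.0:
        return 4.0 - 0.15 * snr_db
    return 1.0


def multiband_spectral_subtraction(noisy_power, noise_power, n_bands=4, floor=0.002):
    """Subtract a noise power estimate from one frame, band by band.

    noisy_power and noise_power are equal-length sequences of power
    spectrum bins. n_bands and floor are hypothetical default values.
    """
    n_bins = len(noisy_power)
    if len(noise_power) != n_bins:
        raise ValueError("noisy_power and noise_power must have the same length")

    enhanced = [0.0] * n_bins
    eps = 1e-12  # avoids division by zero and log of zero

    for b in range(n_bands):
        # Contiguous bands of roughly equal width.
        start = b * n_bins // n_bands
        stop = (b + 1) * n_bins // n_bands
        if start == stop:
            continue

        band_noisy = sum(noisy_power[start:stop])
        band_noise = sum(noise_power[start:stop])
        snr_db = 10.0 * math.log10((band_noisy + eps) / (band_noise + eps))
        alpha = over_subtraction_factor(snr_db)

        for k in range(start, stop):
            cleaned = noisy_power[k] - alpha * noise_power[k]
            # The spectral floor keeps a small fraction of the noisy power
            # instead of letting the result go negative.
            enhanced[k] = max(cleaned, floor * noisy_power[k])

    return enhanced
```

Because the subtraction factor is computed separately for each band, a band dominated by noise is attenuated more aggressively than a band dominated by speech. This per-band behavior is the property that makes the approach attractive when the noise is not spread evenly across the spectrum.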

