http://dx.doi.org/10.21288/resko.2017.11.2.143

Improvement of Speech/Music Classification Based on RNN in EVS Codec for Hearing Aids  

Kang, Sang-Ick (Department of Electronic Engineering, Inha University)
Lee, Sang Min (Department of Electronic Engineering, Inha University)
Publication Information
Journal of rehabilitation welfare engineering & assistive technology / v.11, no.2, 2017, pp. 143-146
Abstract
This paper proposes a novel approach that uses a recurrent neural network (RNN) to improve speech/music classification in the 3GPP Enhanced Voice Services (EVS) codec for hearing aids. The feature vectors fed to the RNN are selected from the relevant parameters of the EVS codec for efficient speech/music classification. The proposed algorithm is evaluated under various conditions on a large set of speech and music data, and it yields better results than the conventional scheme implemented in the EVS codec.
Keywords
Speech/Music Classification; Recurrent Neural Network (RNN); Enhanced Voice Services (EVS); Hearing Aids;