[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.21288/resko.2017.11.2.143

Improvement of Speech/Music Classification Based on RNN in EVS Codec for Hearing Aids

Kang, Sang-Ick (Department of Electronic Engineering, Inha University)
Lee, Sang Min (Department of Electronic Engineering, Inha University)

Publication Information

Journal of Rehabilitation Welfare Engineering & Assistive Technology, v.11, no.2, 2017, pp. 143-146

Abstract

This paper proposes a novel approach that uses a recurrent neural network (RNN) to improve speech/music classification in the 3GPP Enhanced Voice Services (EVS) codec for hearing aids. The feature vectors fed to the RNN are selected from relevant parameters of the EVS codec for efficient speech/music classification. The proposed algorithm is evaluated under various conditions on a large speech/music dataset, and it yields better results than the conventional scheme implemented in the EVS codec.
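
As a rough illustration of the kind of classifier the abstract describes, the sketch below runs a single-layer recurrent network over a sequence of per-frame feature vectors and emits one speech/music probability per frame. This record does not give the paper's actual features, network size, training procedure, or weights, so everything here is an assumption made for illustration: the function name `rnn_classify`, the feature and hidden dimensions, the random weights, and the convention that a probability of 0.5 or more means "music".

```python
import math
import random


def rnn_classify(frames, W_in, W_rec, b_h, w_out, b_out):
    """Forward pass of a single-layer Elman RNN over per-frame features.

    Returns one probability per frame. Under this sketch's arbitrary
    convention, values >= 0.5 are read as "music" and the rest as "speech".
    """
    hidden = [0.0] * len(b_h)  # hidden state carried from frame to frame
    probs = []
    for x in frames:
        new_hidden = []
        for j in range(len(b_h)):
            a = b_h[j]
            a += sum(W_in[j][i] * x[i] for i in range(len(x)))
            a += sum(W_rec[j][k] * hidden[k] for k in range(len(hidden)))
            new_hidden.append(math.tanh(a))
        hidden = new_hidden
        logit = b_out + sum(w_out[j] * hidden[j] for j in range(len(hidden)))
        probs.append(1.0 / (1.0 + math.exp(-logit)))  # sigmoid output
    return probs


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a trained model would supply these instead."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes: 4 features per frame, 3 hidden units, 5 frames.
rng = random.Random(0)
n_features, n_hidden, n_frames = 4, 3, 5
W_in = random_matrix(n_hidden, n_features, rng)
W_rec = random_matrix(n_hidden, n_hidden, rng)
b_h = [0.0] * n_hidden
w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
b_out = 0.0

# Placeholder frames; in a codec these would be per-frame parameters.
frames = [[rng.random() for _ in range(n_features)] for _ in range(n_frames)]
probs = rnn_classify(frames, W_in, W_rec, b_h, w_out, b_out)
decisions = ["music" if p >= 0.5 else "speech" for p in probs]
```

Because the hidden state is carried across frames, each decision depends on earlier frames as well as the current one, which is the property that distinguishes a recurrent classifier from a frame-by-frame one.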

Keywords

Speech/Music Classification; Recurrent Neural Network (RNN); Enhanced Voice Services (EVS); Hearing Aids

Citations & Related Records

Reference

1	C. Lim and J.-H. Chang, "Improvement of SVM-Based Speech/Music Classification Using Adaptive Kernel Technique," IEICE Transactions on Information and Systems, vol. 95, no. 3, pp. 888-891, 2012.
2	V. Malenovsky, T. Vaillancourt, W. Zhe, K. Choo, and V. Atti, "Two-Stage Speech/Music Classifier with Decision Smoothing and Sharpening in the EVS Codec," IEEE Int. Conf. Acoustics, Speech, and Signal Processing, pp. 5718-5722, 2015.
3	S. Karneback, "Discrimination between speech and music based on a low frequency modulation feature," European Conf. on Speech Comm. and Technology, pp. 1891-1894, 2001.
4	A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. Royal Statistical Soc., vol. 39, no. 1, pp. 1-38, 1977.
5	Y. Gal and Z. Ghahramani, "Dropout as a Bayesian approximation: Representing model uncertainty in deep learning," arXiv preprint arXiv:1506.02142, 2015.
6	W. M. Fisher, G. R. Doddington, and K. M. Goudie-Marshall, "The DARPA speech recognition research database: Specification and status," DARPA Workshop Speech Recognition, pp. 93-99, 1986.
7	J. Saunders, "Real-time discrimination of broadcast speech/music," IEEE Int. Conf. Acoustics, Speech, and Signal Processing, vol. 2, pp. 993-996, May 1996.
8	W. Q. Wang, W. Gao, and D. W. Ying, "A fast and robust speech/music discrimination approach," Int. Conf. Information, Communications, and Signal Processing, vol. 3, pp. 1325-1329, 2003.
9	J. H. Song, K. H. Lee, J. H. Chang, J. K. Kim, and N. S. Kim, "Analysis and Improvement of Speech/Music Classification for 3GPP2 SMV Based on GMM," IEEE Signal Process. Lett., vol. 15, pp. 103-106, 2008.
10	Y. Gao, E. Shlomot, and A. Benyassine, "The SMV algorithm selected by TIA and 3GPP2 for CDMA applications," IEEE Int. Conf. Acoustics, Speech, and Signal Processing, vol. 2, pp. 709-712, 2001.
11	3GPP2 Spec., Source-Controlled Variable-Rate Multimedia Wideband Speech Codec (VMR-WB), Service Option 62 and 63 for Spread Spectrum Systems, 3GPP2-C.S0052-A, v.1.0, Apr. 2005.
12	3GPP Spec., Codec for Enhanced Voice Services (EVS); Detailed Algorithm Description, TS 26.445, v.12.0.0, 2014.
