http://dx.doi.org/10.5573/ieek.2013.50.2.185

Speech/Mixed Content Signal Classification Based on GMM Using MFCC  

Kim, Ji-Eun (Department of Radio Engineering, ChungBuk University)
Lee, In-Sung (Department of Radio Engineering, ChungBuk University)
Publication Information
Journal of the Institute of Electronics and Information Engineers / v.50, no.2, 2013, pp. 185-192
Abstract
This paper proposes a method to improve the classification of speech and mixed-content signals in the MPEG USAC (Unified Speech and Audio Coding) standard, using mel-frequency cepstral coefficient (MFCC) features with a GMM-based probability model. The Gaussian mixture model (GMM) is used for effective pattern recognition, and its optimal parameters are estimated with the expectation-maximization (EM) algorithm. The proposed classification algorithm has two main parts: the first extracts the optimal GMM parameters, and the second distinguishes speech from mixed-content signals using MFCC feature parameters. The proposed algorithm performs better than the conventional classification scheme implemented in USAC.
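The two-part structure described in the abstract (EM-based GMM parameter estimation, then classification of feature vectors) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract gives no model details, so everything specific here is an assumption: synthetic 3-dimensional vectors stand in for real MFCC frames, each class gets a 2-component diagonal-covariance GMM, EM runs for a fixed 30 iterations, and a segment is assigned to the class whose model gives the higher average log-likelihood. All function and variable names are hypothetical.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def component_logs(x, gmm):
    """Per-component log(weight) + log density for one feature vector."""
    weights, means, variances = gmm
    return [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]


def fit_gmm(data, k, iters=30, seed=0, floor=1e-3):
    """Part 1: estimate GMM parameters with EM (diagonal covariances assumed)."""
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    means = [list(x) for x in rng.sample(data, k)]  # init means from data points
    gmean = [sum(x[j] for x in data) / n for j in range(d)]
    gvar = [max(sum((x[j] - gmean[j]) ** 2 for x in data) / n, floor)
            for j in range(d)]
    variances = [list(gvar) for _ in range(k)]      # init with global variance
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each frame.
        resp = []
        for x in data:
            logs = component_logs(x, (weights, means, variances))
            norm = logsumexp(logs)
            resp.append([math.exp(v - norm) for v in logs])
        # M-step: re-estimate weights, means, and variances from responsibilities.
        for c in range(k):
            nc = sum(r[c] for r in resp) + 1e-12
            weights[c] = nc / n
            means[c] = [sum(r[c] * x[j] for r, x in zip(resp, data)) / nc
                        for j in range(d)]
            variances[c] = [
                max(sum(r[c] * (x[j] - means[c][j]) ** 2
                        for r, x in zip(resp, data)) / nc, floor)
                for j in range(d)
            ]
    return weights, means, variances


def avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of a segment under one GMM."""
    return sum(logsumexp(component_logs(x, gmm)) for x in frames) / len(frames)


def classify(frames, speech_gmm, mixed_gmm):
    """Part 2: pick the class whose GMM explains the segment better."""
    s = avg_log_likelihood(frames, speech_gmm)
    m = avg_log_likelihood(frames, mixed_gmm)
    return "speech" if s > m else "mixed"


def synthetic_frames(centers, n, seed, spread=0.5):
    """Hypothetical stand-in for MFCC frames: noisy points around class centers."""
    rng = random.Random(seed)
    return [[rng.gauss(c, spread) for c in rng.choice(centers)] for _ in range(n)]


# Assumed cluster centers; real MFCC statistics would come from labeled audio.
SPEECH_CENTERS = [[-2.0, 0.0, 1.0], [-1.0, 1.5, 0.0]]
MIXED_CENTERS = [[2.0, 0.0, -1.0], [1.0, -1.5, 0.5]]

speech_gmm = fit_gmm(synthetic_frames(SPEECH_CENTERS, 300, seed=1), k=2)
mixed_gmm = fit_gmm(synthetic_frames(MIXED_CENTERS, 300, seed=2), k=2)

print(classify(synthetic_frames(SPEECH_CENTERS, 50, seed=3), speech_gmm, mixed_gmm))
print(classify(synthetic_frames(MIXED_CENTERS, 50, seed=4), speech_gmm, mixed_gmm))
```

In practice the synthetic frames would be replaced by MFCC vectors computed from labeled speech and mixed-content audio, and the number of mixture components would be tuned on held-out data.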
Keywords
USAC; MFCC; GMM; Signal Classification;