http://dx.doi.org/10.7776/ASK.2011.30.8.461

Efficient Implementation of SVM-Based Speech/Music Classification on Embedded Systems  

Lim, Chung-Soo (Mokpo National University)
Chang, Joon-Hyuk (School of Electronic Engineering, Hanyang University)
Abstract
Accurate classification of input signals is a key prerequisite for variable bit-rate coding, which was introduced to make effective use of limited communication bandwidth. In particular, the recent surge in multimedia services has raised the importance of speech/music classification. Among the many speech/music classifiers, those based on the support vector machine (SVM) offer high classification accuracy, but their computational complexity and memory requirements hinder practical implementation. Techniques that reduce computational complexity and memory requirements are therefore essential, particularly for embedded systems. We first analyze the implementation of an SVM-based classifier on embedded systems in terms of execution time and energy consumption, and then propose two techniques that ease the implementation requirements. The first removes support vectors that contribute insignificantly to the final classification. The second skips the processing of some input frames by exploiting the strong correlation between neighboring speech/music frames. Both are post-processing techniques that can be combined with any other optimization applied during the SVM training phase. Through experiments, we validate the proposed algorithms in terms of classification accuracy, execution time, and energy consumption.
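The two techniques named in the abstract can be illustrated with a minimal sketch. The abstract does not give the selection criterion or the skipping rule, so everything below is an assumption made for illustration: an RBF kernel, pruning by the magnitude of each support vector's coefficient, and a fixed skip count in which one decision is reused for the following frames. The function names (`prune_support_vectors`, `classify_with_skipping`) are hypothetical and do not come from the paper.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel between two feature vectors (assumed kernel choice)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def decision_value(x, support_vectors, coeffs, bias, gamma):
    """SVM decision function: sum_i coeff_i * K(x, sv_i) + bias.
    Cost grows linearly with the number of support vectors."""
    return sum(c * rbf_kernel(x, sv, gamma)
               for sv, c in zip(support_vectors, coeffs)) + bias

def prune_support_vectors(support_vectors, coeffs, threshold):
    """Drop support vectors whose coefficient magnitude is below threshold.
    Assumed criterion: a small |coeff| means a small contribution."""
    kept = [(sv, c) for sv, c in zip(support_vectors, coeffs)
            if abs(c) >= threshold]
    return [sv for sv, _ in kept], [c for _, c in kept]

def classify_with_skipping(frames, classify, skip):
    """Classify one frame, then reuse its label for the next `skip` frames.
    Assumed rule: neighboring frames usually share the same class."""
    labels, evaluations, i = [], 0, 0
    while i < len(frames):
        label = classify(frames[i])
        evaluations += 1
        run = min(skip + 1, len(frames) - i)
        labels.extend([label] * run)
        i += run
    return labels, evaluations

# Toy model: three support vectors, one with a negligible coefficient.
svs, coeffs = prune_support_vectors(
    [[0.0], [1.0], [0.5]], [1.0, -1.0, 0.001], threshold=0.01)

def is_speech(frame):
    return decision_value(frame, svs, coeffs, bias=0.0, gamma=1.0) > 0

frames = [[0.0], [0.1], [0.0], [1.0], [0.9], [1.0], [0.0]]
labels, evaluations = classify_with_skipping(frames, is_speech, skip=2)
```

Both steps act on an already trained model, which is why they can be layered on top of training-time optimizations. The threshold and the skip count trade accuracy against execution time and would need to be tuned on real data.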
Keywords
Support vector machine (SVM); Speech/music classification algorithm; Embedded system