Adaptive Kernel Function of SVM for Improving Speech/Music Classification of 3GPP2 SMV

  • Lim, Chung-Soo (Department of Electronic Engineering, Inha University; Institute of Information Science and Engineering Research, Mokpo National University)
  • Chang, Joon-Hyuk (Department of Electronic Engineering, Inha University; Department of Electronic Engineering, Hanyang University)
  • Received : 2010.12.24
  • Accepted : 2011.05.06
  • Published : 2011.12.31

Abstract

As a wide variety of multimedia services are delivered through personal wireless communication devices, the demand for efficient bandwidth utilization grows stronger. This demand has naturally led to the introduction of variable-bitrate speech coding. One example is the selectable mode vocoder (SMV), which supports speech/music classification. However, because the classification performance of SMV is severely limited, a couple of studies have proposed improving its speech/music classification by introducing support vector machines (SVMs). While these approaches significantly improved classification accuracy, they did not consider the correlations commonly found among speech and music frames. In this paper, we propose a novel and orthogonal approach that improves the speech/music classification of the SMV codec by adaptively tuning SVMs based on interframe correlations. Experimental results show that the proposed algorithm improves the classification of speech and music within the SMV framework.
