Analysis and Implementation of Speech/Music Classification for 3GPP2 SMV Codec Based on Support Vector Machine

Application of a Support Vector Machine to Improve the Speech/Music Classification Performance of the SMV Codec

  • Published : 2008.11.25

Abstract

In this paper, we propose a novel approach to improving speech/music classification in the 3GPP2 selectable mode vocoder (SMV) using a support vector machine (SVM). The SVM constructs an optimal hyperplane that separates the two classes without error while maximizing the distance between the hyperplane and the closest training vectors. We first analyze the features and the classification method adopted in the conventional SMV. Feature vectors for the SVM are then selected from the relevant SMV parameters for efficient speech/music classification. The proposed algorithm is evaluated under various conditions and yields better results than the conventional SMV scheme.
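The maximal-margin criterion described above can be written as the standard hard-margin SVM problem. The notation here is assumed for illustration and is not taken from the paper: $\mathbf{x}_i$ is the feature vector of training frame $i$, $y_i \in \{+1, -1\}$ is its class label (for example, speech and music), and $\mathbf{w}$, $b$ define the hyperplane.

```latex
\begin{aligned}
\min_{\mathbf{w},\, b} \quad & \tfrac{1}{2}\,\lVert \mathbf{w} \rVert^{2} \\
\text{subject to} \quad & y_i \left( \mathbf{w}^{\top} \mathbf{x}_i + b \right) \ge 1,
\qquad i = 1, \dots, N .
\end{aligned}
```

Under these constraints the closest training vectors lie at distance $1 / \lVert \mathbf{w} \rVert$ from the hyperplane, so minimizing $\lVert \mathbf{w} \rVert$ maximizes that distance. A new frame $\mathbf{x}$ is then assigned to a class by the sign of $\mathbf{w}^{\top} \mathbf{x} + b$.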

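This paper presents a method for improving the speech/music classification performance of the existing 3GPP2 selectable mode vocoder (SMV) codec using a support vector machine (SVM). Grounded in statistical learning theory, the SVM finds the optimal separating hyperplane between the training data and thereby provides an optimized binary classification. We analyze the feature vectors and the classification method used in the SMV's real-time speech/music classification algorithm and, on that basis, introduce the SVM to improve classification performance. Specifically, we present a classification technique in which the SVM is built using only feature vectors selected from those already used in the SMV speech/music classification algorithm. To assess the SVM applied to SMV speech/music classification, we compared it with the original SMV classification algorithm; evaluated across various music genres, the SVM achieved better speech/music classification performance than the conventional SMV method.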
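As a rough illustration of SVM-based binary classification of frames, the following sketch trains a linear SVM on synthetic data. It is not the paper's implementation. The two features are synthetic stand-ins (the abstract does not list the SMV parameters actually selected), the soft-margin hinge-loss trainer is a simple substitute for whatever solver and kernel the authors used, and every function name and hyperparameter is hypothetical.

```python
import random


def make_frames(n_per_class, seed=0):
    """Generate synthetic 2-D feature vectors: +1 = speech, -1 = music.

    The features are placeholders, not real SMV parameters.
    """
    rng = random.Random(seed)
    samples, labels = [], []
    for label, centre in ((1, 1.0), (-1, -1.0)):
        for _ in range(n_per_class):
            samples.append([rng.gauss(centre, 0.3), rng.gauss(centre, 0.3)])
            labels.append(label)
    return samples, labels


def decision_value(w, b, x):
    """Signed score w.x + b; its sign gives the predicted class."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(samples, labels, lam=0.01, eta=0.01, epochs=50, seed=0):
    """Minimise the regularised hinge loss by stochastic subgradient descent."""
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    b = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            violated = y * decision_value(w, b, x) < 1.0
            # Shrink step from the (lam / 2) * ||w||^2 regulariser.
            w = [(1.0 - eta * lam) * wj for wj in w]
            if violated:
                # Hinge-loss subgradient: push this frame's margin up.
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
                b += eta * y
    return w, b


def classify(w, b, x):
    """Return +1 (speech) or -1 (music)."""
    return 1 if decision_value(w, b, x) >= 0.0 else -1


def accuracy(w, b, samples, labels):
    """Fraction of frames whose predicted class matches the label."""
    hits = sum(classify(w, b, x) == y for x, y in zip(samples, labels))
    return hits / len(samples)


if __name__ == "__main__":
    train_x, train_y = make_frames(100, seed=0)
    test_x, test_y = make_frames(100, seed=1)
    w, b = train_linear_svm(train_x, train_y)
    print("weights:", w, "bias:", b)
    print("held-out accuracy:", accuracy(w, b, test_x, test_y))
```

Running the script trains on one synthetic set and reports accuracy on a second, independently seeded set. In a real system the feature vectors would come from the codec's per-frame parameters, and a kernel SVM from an established library would likely replace this hand-written trainer.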
