A Study on SVM-Based Speaker Classification Using GMM-supervector

GMM-supervector를 사용한 SVM 기반 화자분류에 대한 연구

  • Received : 2020.11.16
  • Accepted : 2020.12.29
  • Published : 2020.12.31

Abstract

This paper experiments with SVM-based speaker classification using the GMM-supervector as the feature parameter. To create the speaker clusters, conventional speaker change detection is performed using the KL distance with an SNR-based weighting function. The SVM-based speaker classification consists of two steps. In the first step, SVM-based classification between the UBM and the speaker models is performed, speaker information is indexed for each cluster, and the clusters are then grouped by speaker. In the second step, SVM-based classification between the UBM and the speaker models is performed on the resulting speaker cluster groups. Linear and RBF kernels are used as the SVM kernel functions. In the first step, the linear kernel outperformed the RBF kernel, yielding 148 speaker clusters, MDR 0, FAR 47.3, and ER 50.7. In the second step, the linear kernel again performed best, yielding 109 speaker clusters, MDR 1.3, FAR 28.4, and ER 32.1.
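
As a rough illustration of the feature and kernels named in the abstract, the sketch below builds a GMM-supervector by MAP-adapting the means of a toy UBM to one cluster's frames, then evaluates a linear and an RBF kernel between two such supervectors. The abstract does not give the adaptation recipe, so several choices here are assumptions: mean-only adaptation, diagonal covariances, a relevance factor of 16, raw (unnormalized) stacked means, and an RBF width of 0.5. All function and variable names are hypothetical, and the SVM training itself is not shown.

```python
import math

def gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def map_adapt_supervector(frames, weights, means, variances, relevance=16.0):
    """MAP-adapt the UBM means to `frames` and stack them into one vector.

    Assumption: only the means are adapted; `relevance` is an illustrative value.
    """
    n_comp, dim = len(means), len(means[0])
    occupancy = [0.0] * n_comp                          # soft counts per component
    first_order = [[0.0] * dim for _ in range(n_comp)]  # posterior-weighted frame sums
    for x in frames:
        log_post = [
            math.log(w) + gaussian_logpdf(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        peak = max(log_post)                            # subtract max for stability
        post = [math.exp(lp - peak) for lp in log_post]
        total = sum(post)
        for k in range(n_comp):
            gamma = post[k] / total
            occupancy[k] += gamma
            for d in range(dim):
                first_order[k][d] += gamma * x[d]
    supervector = []
    for k in range(n_comp):
        alpha = occupancy[k] / (occupancy[k] + relevance)
        for d in range(dim):
            # With no data for a component, fall back to the UBM mean.
            data_mean = first_order[k][d] / occupancy[k] if occupancy[k] > 0 else means[k][d]
            supervector.append(alpha * data_mean + (1.0 - alpha) * means[k][d])
    return supervector

def linear_kernel(u, v):
    """Plain inner product between two supervectors."""
    return sum(a * b for a, b in zip(u, v))

def rbf_kernel(u, v, gamma=0.5):
    """Gaussian RBF kernel; `gamma` is an assumed width parameter."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

# Toy two-component, two-dimensional UBM (made-up numbers).
ubm_weights = [0.5, 0.5]
ubm_means = [[0.0, 0.0], [4.0, 4.0]]
ubm_vars = [[1.0, 1.0], [1.0, 1.0]]

# Two toy speaker clusters, each a list of feature frames.
cluster_a = [[1.0, 1.0]] * 10
cluster_b = [[3.0, 3.5]] * 10

sv_a = map_adapt_supervector(cluster_a, ubm_weights, ubm_means, ubm_vars)
sv_b = map_adapt_supervector(cluster_b, ubm_weights, ubm_means, ubm_vars)
print(linear_kernel(sv_a, sv_b), rbf_kernel(sv_a, sv_b))
```

In a full system, each cluster's supervector would be passed to an SVM trained with one of these kernels; the toy UBM and frames above only show the shape of the data flowing into that step.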
