[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.22156/CS4SMB.2017.7.6.159

The Study on Speaker Change Verification Using SNR based weighted KL distance

Cho, Joon-Beom (Department of Nursing, Nambu University)
Lee, Ji-eun (Department of Living physical Training Special Study, Chunnam Techno University)
Lee, Kyong-Rok (Department of IT & Design, Nambu University)

Publication Information

Journal of Convergence for Information Technology, v.7, no.6, 2017, pp. 159-166

Abstract

In this paper, we report experiments aimed at improving the verification performance of speaker change detection on broadcast news. The approach enhances the noisy input speech and applies a KL distance $D_s$ that uses an SNR-based weighting function $w_m$. The baseline (Experiment 0) is a speaker change verification system using the GMM-UBM based KL distance $D$. Experiment 1 adds enhancement of the noisy input speech using MMSE Log-STSA. Experiment 2 applies the new KL distance $D_s$ to the system of Experiment 1. All experiments were conducted under the condition of 0% MDR (missed detection rate) so that no speaker change information is lost. The FAR (false alarm rate) of Experiment 0 was 71.5%. The FAR of Experiment 1 was 67.3%, 4.2 percentage points lower than in Experiment 0. The FAR of Experiment 2 was 60.7%, 10.8 percentage points lower than in Experiment 0.
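The evaluation condition in the abstract (FAR measured at 0% MDR) can be illustrated with a minimal sketch. It is not the authors' code. It assumes that each candidate boundary has a distance score, that a candidate is accepted as a speaker change when its score is at or above a threshold, and that FAR is the fraction of non-change candidates accepted. The function name `far_at_zero_mdr` and the toy scores are hypothetical.

```python
def far_at_zero_mdr(scores, is_change):
    """Return (threshold, FAR) with the threshold chosen so no true change is missed.

    Assumption: higher score means "more likely a speaker change", and
    FAR = accepted non-change candidates / all non-change candidates.
    """
    change_scores = [s for s, c in zip(scores, is_change) if c]
    other_scores = [s for s, c in zip(scores, is_change) if not c]
    if not change_scores or not other_scores:
        raise ValueError("need at least one change and one non-change candidate")
    # Largest threshold that still accepts every true change (MDR = 0%).
    threshold = min(change_scores)
    false_alarms = sum(1 for s in other_scores if s >= threshold)
    return threshold, false_alarms / len(other_scores)


# Toy data (hypothetical): two true changes, four non-change candidates.
scores = [0.9, 0.4, 0.3, 0.2, 0.5, 0.1]
is_change = [True, False, True, False, False, False]
threshold, far = far_at_zero_mdr(scores, is_change)
print(threshold, far)
```

Under this reading, a distance that separates true changes from non-changes more cleanly lets the threshold sit higher without missing a change, which is how the FAR can fall while the MDR stays at 0%.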

Keywords

Speaker Change Detection; Kullback-Leibler distance; Speech Enhancement; Minimum Mean Square Error Log-Spectral Amplitude Estimator; Signal-to-Noise Ratio

