[KSCI] Korea Science Citation Index Service

Sequential Speaker Classification Using Quantized Generic Speaker Models

Kwon, Soon-Il (Division of Systems Technology, Korea Institute of Science and Technology)

Publication Information

Journal of the Institute of Electronics Engineers of Korea CI / v.44, no.1, 2007, pp. 26-32

Abstract

In sequential speaker classification, the lack of prior information about the speakers makes model initialization difficult. To address this challenge, a predetermined generic model set, called Sample Speaker Models, was previously proposed. This approach can be useful for accurate speaker modeling without requiring initial speaker data. However, it still requires an optimal method for sampling the models from a generic model pool. To solve this problem, the Speaker Quantization method, motivated by vector quantization, is proposed. In experiments on Switchboard telephone conversations, the new approach outperformed random sampling, with a 25% relative improvement in error rate.
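
The abstract does not spell out the Speaker Quantization algorithm, so the following is only a minimal sketch of the general idea it names: choosing a small set of representative models from a generic pool in the manner of vector quantization, rather than picking them at random. Everything specific here is an assumption made for illustration and is not taken from the paper: each pool speaker is reduced to a single diagonal-covariance Gaussian, the distance is a symmetric Kullback-Leibler divergence, the codebook is found with a Lloyd-style k-medoids loop, and the pool itself is synthetic. The function names (`sym_kl_diag`, `pairwise_distances`, `quantize_pool`, `distortion`) are hypothetical.

```python
import numpy as np


def sym_kl_diag(mu_a, var_a, mu_b, var_b):
    """Symmetric KL divergence between two diagonal-covariance Gaussians."""
    d2 = (mu_a - mu_b) ** 2
    kl_ab = 0.5 * np.sum(np.log(var_b / var_a) + (var_a + d2) / var_b - 1.0)
    kl_ba = 0.5 * np.sum(np.log(var_a / var_b) + (var_b + d2) / var_a - 1.0)
    return kl_ab + kl_ba


def pairwise_distances(means, variances):
    """Symmetric distance matrix over all models in the pool."""
    n = len(means)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = sym_kl_diag(
                means[i], variances[i], means[j], variances[j]
            )
    return dist


def quantize_pool(dist, k, n_iter=50, seed=0):
    """Pick k representative models (medoids) with a Lloyd-style loop."""
    rng = np.random.default_rng(seed)
    medoids = rng.choice(len(dist), size=k, replace=False)  # random start
    for _ in range(n_iter):
        # Assignment step: each pool model joins its nearest representative.
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue
            # Update step: the member closest in total to the rest of its cell.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels


def distortion(dist, chosen):
    """Mean distance from each pool model to its nearest chosen model."""
    return dist[:, chosen].min(axis=1).mean()


# Synthetic stand-in for a generic model pool: 40 speakers, 12-dim features.
rng = np.random.default_rng(1)
pool_means = rng.normal(size=(40, 12))
pool_vars = rng.uniform(0.5, 2.0, size=(40, 12))

dist = pairwise_distances(pool_means, pool_vars)
medoids, labels = quantize_pool(dist, k=8)
random_pick = rng.choice(40, size=8, replace=False)

print("quantized distortion:", distortion(dist, medoids))
print("random distortion:   ", distortion(dist, random_pick))
```

Each pass of the loop can only lower or maintain the mean distance to the nearest representative, so the quantized set covers the pool at least as well as the random set the loop started from. Whether lower distortion on the pool translates into a lower classification error rate is the empirical question the paper addresses, and this sketch does not reproduce that result.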

Keywords

sequential speaker classification; generic speaker model; sample speaker model; speaker quantization

