
Sequential Speaker Classification Using Quantized Generic Speaker Models  

Kwon, Soon-Il (Division of Systems Technology, Korea Institute of Science and Technology)
Abstract
In sequential speaker classification, the lack of prior information about the speakers makes model initialization difficult. To address this, a predetermined set of generic models, called Sample Speaker Models, was proposed previously. That approach allows accurate speaker modeling without any initial speaker data, but it leaves open the question of how best to sample the models from a generic model pool. To solve this problem, the Speaker Quantization method, motivated by vector quantization, is proposed. In experiments on Switchboard telephone conversations, the new approach outperformed random sampling, with a 25% relative improvement in error rate.
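The abstract does not say how Speaker Quantization is carried out, so the following is only a rough sketch of the general idea of choosing a small representative set from a model pool in the spirit of vector quantization. It assumes each generic speaker model can be summarized as a plain feature vector and compared with squared Euclidean distance; both are stand-ins, not the paper's model type or distance measure. The function names (`quantize_pool`, `distortion`) and the k-medoids-style loop are illustrative choices, not the author's algorithm.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def distortion(pool, chosen):
    """Mean distance from each pool model to its nearest chosen model."""
    return sum(min(sq_dist(m, pool[c]) for c in chosen) for m in pool) / len(pool)


def quantize_pool(pool, k, iters=20):
    """Pick k pool indices as codewords with a k-medoids-style loop.

    Assumption: models are vectors; a real system would compare
    statistical speaker models with a model-level distance instead.
    """
    # Farthest-first initialization: deterministic, spreads codewords out.
    chosen = [0]
    while len(chosen) < k:
        nxt = max(range(len(pool)),
                  key=lambda i: min(sq_dist(pool[i], pool[c]) for c in chosen))
        chosen.append(nxt)

    for _ in range(iters):
        # Assignment step: each pool model joins its nearest codeword's cell.
        cells = {c: [] for c in chosen}
        for i, m in enumerate(pool):
            nearest = min(chosen, key=lambda c: sq_dist(m, pool[c]))
            cells[nearest].append(i)
        # Update step: each cell's new codeword is its medoid, the member
        # with the smallest total distance to the other members.
        updated = []
        for c, members in cells.items():
            if not members:  # keep the old codeword if its cell is empty
                updated.append(c)
                continue
            updated.append(min(
                members,
                key=lambda j: sum(sq_dist(pool[j], pool[i]) for i in members)))
        if sorted(updated) == sorted(chosen):
            break
        chosen = updated
    return sorted(chosen)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical pool: 60 two-dimensional "models" (synthetic data).
    demo_pool = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(60)]
    quantized = quantize_pool(demo_pool, k=8)
    sampled = rng.sample(range(len(demo_pool)), 8)  # random-sampling baseline
    print("quantized distortion:", distortion(demo_pool, quantized))
    print("random distortion:   ", distortion(demo_pool, sampled))
```

The demo only compares how well each selection covers a synthetic pool; it says nothing about classification error, and a random draw can occasionally cover the pool as well as the quantized set.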
Keywords
Sequential speaker classification; Generic speaker model; Sample speaker model; Speaker quantization