On-Line Linear Combination of Classifiers Based on Incremental Information in Speaker Verification

  • Huenupan, Fernando (Speech Processing and Transmission Laboratory, Department of Electrical Engineering, Universidad de Chile)
  • Yoma, Nestor Becerra (Speech Processing and Transmission Laboratory, Department of Electrical Engineering, Universidad de Chile)
  • Garreton, Claudio (Speech Processing and Transmission Laboratory, Department of Electrical Engineering, Universidad de Chile)
  • Molina, Carlos (Speech Processing and Transmission Laboratory, Department of Electrical Engineering, Universidad de Chile)
  • Received : 2009.05.27
  • Accepted : 2010.01.12
  • Published : 2010.06.30

Abstract

A novel multiclassifier system (MCS) strategy is proposed and applied to a text-dependent speaker verification task. The presented scheme optimizes the linear combination of classifiers on an on-line basis. In contrast to ordinary MCS approaches, it requires neither a priori distributions nor pre-tuned parameters such as pre-estimated weights, and it makes no assumption about whether training and testing conditions match. The idea is to improve the most accurate classifier by making use of the incremental information provided by the second classifier. The on-line multiclassifier optimization approach is applicable to any pattern recognition problem. Results with the YOHO database show that the presented approach can lead to reductions in equal error rate as high as 28% when compared with the most accurate classifier, and 11% when compared with a standard method for optimizing the linear combination of classifiers.
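
To make the quantities in the abstract concrete, the following is a minimal Python sketch of a linear combination of two classifiers' scores and of an equal error rate (EER) estimate. It is not the authors' on-line optimization: the abstract does not give the criterion used to choose the weight, so the weight `w` is simply an input here. The function names, the threshold sweep used to approximate the EER, and the reading of the reported percentages as relative reductions are all assumptions made for illustration.

```python
import numpy as np


def fuse_scores(s1, s2, w):
    """Linearly combine two classifiers' scores.

    w is the weight on the first classifier and (1 - w) on the second.
    How w is chosen is NOT specified here; the paper's on-line criterion
    is not described in the abstract.
    """
    s1 = np.asarray(s1, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    return w * s1 + (1.0 - w) * s2


def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    A trial is accepted when its score is >= the threshold. The function
    returns the mean of the false acceptance and false rejection rates at
    the threshold where the two rates are closest.
    """
    target = np.asarray(target_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, impostor]))
    far = np.array([(impostor >= t).mean() for t in thresholds])
    frr = np.array([(target < t).mean() for t in thresholds])
    i = int(np.argmin(np.abs(far - frr)))
    return 0.5 * (far[i] + frr[i])


def relative_reduction(baseline_eer, improved_eer):
    """Relative EER reduction, assuming percentages are reported this way."""
    return (baseline_eer - improved_eer) / baseline_eer
```

Under these assumptions, one would compute `equal_error_rate` on the scores of the most accurate classifier alone and on the output of `fuse_scores`, then compare the two values with `relative_reduction`.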
