Parts-Based Feature Extraction of Spectrum of Speech Signal Using Non-Negative Matrix Factorization

Park, Jeong-Won (Department of Electronic Engineering, Dong-A University)
Kim, Chang-Keun (Department of Electronic Engineering, Dong-A University)
Lee, Kwang-Seok (Department of Electronic Engineering, Jinju National University)
Koh, Si-Young (School of Electronic Information and Communication Engineering, Kyungil University)
Hur, Kang-In (Department of Electronic Engineering, Dong-A University)

Publication Information

Journal of information and communication convergence engineering / v.1, no.4, 2003, pp. 209-212

Abstract

In this paper, we propose a new speech feature parameter obtained through parts-based feature extraction of the speech spectrum using Non-Negative Matrix Factorization (NMF). NMF can effectively reduce the dimension of multi-dimensional data through matrix factorization under non-negativity constraints, and the dimensionally reduced data represent parts-based features of the input data. For speech feature extraction, we applied Mel-scaled filter bank outputs as inputs to NMF, then used the outputs of NMF as inputs to the speech recognizer. The recognition experiments confirmed that the proposed feature parameter gives better recognition performance than the commonly used mel-frequency cepstral coefficients (MFCC).
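
The factorization step described in the abstract can be illustrated with a minimal sketch. It assumes the widely used multiplicative update rules for the Euclidean cost; the abstract does not state which cost function, rank, or filter bank size the authors used, so `rank=8`, the 24-channel input, the function name `nmf`, and the random stand-in data are all hypothetical choices made only for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (channels x frames) as W @ H.

    W (channels x rank) holds non-negative basis vectors ("parts");
    H (rank x frames) holds the non-negative activations per frame.
    Uses multiplicative updates for the squared Euclidean error.
    """
    rng = np.random.default_rng(seed)
    n_channels, n_frames = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_channels, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H
        # stay non-negative without any explicit projection step.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Random non-negative data standing in for Mel-scaled filter bank
# outputs (assumed here: 24 channels, 100 frames).
rng = np.random.default_rng(1)
V = rng.random((24, 100))

W, H = nmf(V, rank=8)

# One reduced-dimension feature vector per frame, which is the kind of
# input that would then be passed to a recognizer.
features = H.T
```

Here each column of `W` plays the role of a spectral "part", and each row of `features` is the dimensionally reduced representation of one frame. Real filter bank energies would replace the random matrix `V`.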

Keywords

Non-Negative Matrix Factorization; Parts-based Feature Extraction; Mel-scaled Filter Bank Output
