Classification of General Sound with Non-negativity Constraints

General Sound Classification via Non-negativity Constraints

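  • Yong-Choon Cho (조용춘; DM Research Lab, Digital Media Business, Samsung Electronics)
  • Seungjin Choi (최승진; Department of Computer Science and Engineering, Pohang University of Science and Technology)
  • Sung-Yang Bang (방승양; Department of Computer Science and Engineering, Pohang University of Science and Technology)
  • Published: 2004.10.01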

Abstract

Sparse coding or independent component analysis (ICA), which is a holistic representation, has been successfully applied to elucidate early auditory processing and to the task of sound classification. In contrast, parts-based representation is an alternative way of understanding object recognition in the brain. In this paper, we employ non-negative matrix factorization (NMF), which learns a parts-based representation, in the task of sound classification. We explain methods of extracting features from spectro-temporal sounds using NMF, both in the absence and in the presence of noise. Experimental results show that NMF-based features improve sound classification performance over ICA-based features.

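Sparse coding or independent component analysis (ICA), a holistic representation, has been successfully applied to explain early auditory processing and the task of sound classification. In contrast, parts-based representation offers another way of understanding how the brain recognizes objects. In this paper, we apply non-negative matrix factorization (NMF) (1), which learns a parts-based representation, to the task of sound classification. We describe how to extract features from spectro-temporal sounds using NMF in two settings, with and without noise. Experimental results show that features based on NMF improve sound classification performance over features extracted with ICA.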
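
The NMF step named in the abstract can be illustrated with a minimal sketch. It assumes the Frobenius-norm multiplicative updates of Lee and Seung (reference 13); the abstract does not say which objective, rank, or iteration count the authors used, so those choices, the function names `nmf` and `encode`, and the random toy matrix standing in for a spectrogram are all assumptions made for illustration.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (n x m) as W @ H with W, H >= 0.

    Uses multiplicative updates for the squared Frobenius error.
    The columns of W play the role of parts-based basis vectors.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so signs never flip.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def encode(V_new, W, n_iter=200, eps=1e-9, seed=0):
    """Compute non-negative features for new data with the basis W held fixed."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V_new.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V_new) / (W.T @ W @ H + eps)
    return H


# Toy stand-in for a magnitude spectrogram: 20 frequency bins, 50 frames,
# generated from 4 hidden non-negative parts (hypothetical data).
rng = np.random.default_rng(1)
V = rng.random((20, 4)) @ rng.random((4, 50))

W, H = nmf(V, rank=4)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# Features for "unseen" frames, reusing the learned basis.
H_new = encode(V[:, :5], W)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In a classification setting, the columns of `H_new` would serve as feature vectors passed to a classifier. Holding `W` fixed at test time is one common design choice, not necessarily the one used in the paper.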

References

  1. D. D. Lee and H. S. Seung, 'Learning the parts of objects by non-negative matrix factorization,' Nature, Vol. 401, pp. 788-791, Oct. 1999 https://doi.org/10.1038/44565
  2. M. Casey, 'Sound classification and similarity tools,' in Introduction to MPEG-7: Multimedia Content Description Language, B. S. Manjunath, P. Salembier, and T. Sikora, Eds. John Wiley & Sons, Inc., 2001
  3. M. Casey, 'Reduced-rank spectra and minimum-entropy priors as consistent and reliable cues for generalized sound recognition,' in Proc. Workshop on Consistent and Reliable Acoustic Cues for Sound Analysis, Eurospeech, Aalborg, Denmark, 2001
  4. A. Hyvärinen, J. Karhunen, and E. Oja, Independent Component Analysis, John Wiley & Sons, Inc., 2001
  5. A. Cichocki and S. Amari, Adaptive Blind Signal and Image Processing: Learning Algorithms and Applications, John Wiley & Sons, Inc., 2002
  6. A. Bell and T. Sejnowski, 'Learning the higher-order structure of a natural sound,' Network: Computation in Neural Systems, Vol. 7, pp. 261-266, 1996 https://doi.org/10.1088/0954-898X/7/2/005
  7. M. S. Lewicki, 'Efficient coding of natural sounds,' Nature Neuroscience, Vol. 5, No. 4, pp. 356-363, 2002 https://doi.org/10.1038/nn831
  8. K. P. Körding, P. König, and D. J. Klein, 'Learning of sparse auditory receptive fields,' in Proc. IJCNN, Honolulu, Hawaii, 2002 https://doi.org/10.1109/IJCNN.2002.1007648
  9. B. A. Olshausen and D. J. Field, 'Sparse coding with an overcomplete basis set: A strategy employed by V1,' Vision Research, Vol. 37, pp. 3311-3325, 1997 https://doi.org/10.1016/S0042-6989(97)00169-7
  10. D. A. Depireux, J. Z. Simon, D. J. Klein, and S. A. Shamma, 'Spectro-temporal response field characterization with dynamic ripples in ferret primary auditory cortex,' J. Neurophysiology, Vol. 85, pp. 1220-1234, 2001
  11. S. Shamma, 'On the role of space and time in auditory processing,' Trends in Cognitive Sciences, Vol. 5, No. 8, pp. 340-348, 2001 https://doi.org/10.1016/S1364-6613(00)01704-6
  12. M. S. Gazzaniga, R. B. Ivry, and G. R. Mangun, Cognitive Neuroscience: The Biology of the Mind, W. W. Norton & Company, New York, 2001
  13. D. D. Lee and H. S. Seung, 'Algorithms for non-negative matrix factorization,' in Advances in Neural Information Processing Systems, Vol. 13, 2001
  14. L. R. Rabiner and B. H. Juang, 'An introduction to hidden Markov models,' IEEE ASSP Magazine, Vol. 3, No. 1, pp. 4-16, 1986