http://dx.doi.org/10.7776/ASK.2008.27.8.418

A Method of Sound Segmentation in Time-Frequency Domain Using Peaks and Valleys in Spectrogram for Speech Separation  

Lim, Sung-Kil (Department of Computer Engineering, Kyung Hee University)
Lee, Hyon-Soo (Department of Computer Engineering, Kyung Hee University)
Abstract
In this paper, we propose an algorithm for frequency channel segmentation using peaks and valleys in the spectrogram. A frequency channel segment is a local group of channels in the frequency domain that is likely to have arisen from the same sound source. The proposed algorithm is based on the smoothed spectrum of the input sound. Peaks and valleys in the smoothed spectrum are used to determine the centers and boundaries of segments, respectively. To evaluate the suitability of the proposed segmentation algorithm before the grouping stage is applied, we compare the signals synthesized with an ideal mask against those synthesized with the proposed algorithm. Simulations are performed on speech signals mixed with narrow-band noise, wide-band noise, and other speech signals.
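The peak-and-valley idea described in the abstract can be illustrated with a minimal sketch for a single spectrogram frame. This is not the authors' implementation: the abstract does not state how the spectrum is smoothed, so the moving-average window, its `width` parameter, and the function names `smooth` and `segment_channels` are assumptions made only for illustration.

```python
import numpy as np

def smooth(spectrum, width=5):
    """Moving-average smoothing (assumed method); `width` must be odd."""
    kernel = np.ones(width) / width
    # Pad with edge values so the output has the same length as the input.
    padded = np.pad(spectrum, width // 2, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

def segment_channels(spectrum, width=5):
    """Return (start, center, end) channel indices for each segment.

    Peaks of the smoothed spectrum act as segment centers and the
    nearest valleys on either side act as segment boundaries.
    """
    s = smooth(np.asarray(spectrum, dtype=float), width)
    inner = np.arange(1, len(s) - 1)
    # Local maxima and minima of the smoothed spectrum.
    peaks = inner[(s[inner] > s[inner - 1]) & (s[inner] >= s[inner + 1])]
    valleys = inner[(s[inner] < s[inner - 1]) & (s[inner] <= s[inner + 1])]
    segments = []
    for p in peaks:
        lower = valleys[valleys < p]
        upper = valleys[valleys > p]
        # Fall back to the first or last channel when no valley exists.
        start = int(lower[-1]) if lower.size else 0
        end = int(upper[0]) if upper.size else len(s) - 1
        segments.append((start, int(p), end))
    return segments

# Hypothetical frame: two spectral bumps standing in for two sources.
channels = np.arange(64)
spectrum = (np.exp(-0.5 * ((channels - 16) / 4.0) ** 2)
            + 0.8 * np.exp(-0.5 * ((channels - 44) / 5.0) ** 2))
segments = segment_channels(spectrum)
```

For this synthetic frame the sketch yields two segments centered on channels 16 and 44, sharing one boundary at the valley between them. A wider smoothing window merges nearby peaks into fewer, broader segments, which is the main trade-off in choosing `width`.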
Keywords
Frequency Channel Segmentation; Speech Separation; Sound Segmentation; Peak; Valley; Spectrogram