http://dx.doi.org/10.5909/JBE.2015.20.1.164

Eigenvoice Adaptation of Classification Model for Binary Mask Estimation  

Kim, Gibak (School of Electrical Engineering, Soongsil University)
Publication Information
Journal of Broadcast Engineering, v.20, no.1, 2015, pp. 164-170
Abstract
This paper addresses the adaptation of the classification model used in the binary mask approach to noise suppression. Binary mask estimation is known to improve the intelligibility of noisy speech. However, building the classification model requires that the training data include the same type of noise as the test data. To relax this requirement, eigenvoice adaptation is applied to a noise-independent classification model, and the adapted model is used as a noise-dependent model. Results are reported as hit rates and false alarm rates. The experiments confirm that classification accuracy improves as the number of adaptation sentences increases.
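The hit and false alarm rates mentioned in the abstract can be illustrated with a minimal sketch. It assumes the conventional definitions for binary mask evaluation, which the abstract itself does not spell out: the hit rate is the fraction of target-dominated time-frequency units (ideal mask value 1) that the estimated mask also labels 1, and the false alarm rate is the fraction of noise-dominated units (ideal mask value 0) that the estimated mask wrongly labels 1. The function name `hit_fa_rates` and the toy masks are hypothetical and are not taken from the paper.

```python
def hit_fa_rates(ideal_mask, estimated_mask):
    """Return (hit_rate, false_alarm_rate) for two binary masks.

    Both masks are 2-D lists (channels x frames) of 0/1 values.
    Definitions are the assumed conventional ones, not quoted from the paper.
    """
    # Flatten both masks into lists of booleans.
    ideal = [bool(v) for row in ideal_mask for v in row]
    est = [bool(v) for row in estimated_mask for v in row]
    if len(ideal) != len(est):
        raise ValueError("masks must contain the same number of units")

    n_target = sum(ideal)              # units where the ideal mask is 1
    n_noise = len(ideal) - n_target    # units where the ideal mask is 0

    hits = sum(1 for i, e in zip(ideal, est) if i and e)
    false_alarms = sum(1 for i, e in zip(ideal, est) if (not i) and e)

    # Guard against masks with no target units or no noise units.
    hit_rate = hits / n_target if n_target else float("nan")
    fa_rate = false_alarms / n_noise if n_noise else float("nan")
    return hit_rate, fa_rate


# Toy example: 2 channels x 4 frames (made-up values).
ideal = [[1, 1, 0, 0],
         [1, 0, 0, 1]]
estimated = [[1, 0, 0, 1],
             [1, 0, 0, 1]]

hit, fa = hit_fa_rates(ideal, estimated)
print(f"HIT = {hit:.2f}, FA = {fa:.2f}, HIT - FA = {hit - fa:.2f}")
```

The difference HIT - FA is often used as a single summary figure, since a classifier that labels every unit 1 reaches a perfect hit rate but also a false alarm rate of 1.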
Keywords
Noise reduction; Binary mask estimation; Environment adaptation