An Analysis of Acoustic Features Caused by Articulatory Changes for Korean Distant-Talking Speech

  • Kim Sunhee (Dept. of Electrical Engineering and Computer Science, KAIST)
  • Park Soyoung (Dept. of Electrical Engineering and Computer Science, KAIST)
  • Yoo Chang D. (Dept. of Electrical Engineering and Computer Science, KAIST)
  • Published: 2005.06.01

Abstract

Compared with normal speech, distant-talking speech is shaped both by acoustic effects from interfering sound and echoes and by articulatory changes that result from the speaker's effort to be more intelligible. This paper analyzes the acoustic features of distant-talking speech that arise from these articulatory changes and compares them with those of the Lombard effect. To examine the effects of distance and of articulatory change, speech recognition experiments were conducted with HTK on normal speech and on distant-talking speech at different distances. The speech data consist of 4500 distant-talking utterances and 4500 normal utterances from 90 speakers (56 male and 34 female). The acoustic features selected for analysis were duration, formants (F1 and F2), fundamental frequency, total energy, and energy distribution. The results show that the acoustic-phonetic features of distant-talking speech correspond mostly to those of Lombard speech: the main acoustic changes from normal to distant-talking speech are an increase in vowel duration, a shift in the first and second formants, an increase in fundamental frequency, an increase in total energy, and a shift in energy from the low frequency band to the middle or high bands.
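
Two of the features named above, total energy and energy distribution across frequency bands, can be illustrated with a minimal sketch. This is not the paper's measurement procedure: the function names, the naive DFT, and the band edges (0 to 1000 Hz, 1000 to 2500 Hz, 2500 to 4000 Hz) are assumptions chosen for illustration, and synthetic sine tones stand in for speech frames.

```python
import cmath
import math


def sine(freq_hz, sample_rate, n):
    """Synthetic unit-amplitude tone used as a stand-in for a speech frame."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


def band_energy_shares(samples, sample_rate, band_edges_hz):
    """Return (total_energy, shares).

    total_energy is the sum of squared samples (time domain).
    shares[i] is the fraction of spectral energy whose frequency f
    satisfies band_edges_hz[i] <= f < band_edges_hz[i + 1].
    The band edges are hypothetical; the abstract does not specify them.
    """
    n = len(samples)
    # Naive DFT over the positive-frequency bins, skipping DC (k = 0).
    power = []
    for k in range(1, n // 2):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        power.append((k * sample_rate / n, abs(coeff) ** 2))

    total_energy = sum(x * x for x in samples)
    spectral_total = sum(p for _, p in power)

    shares = []
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        band = sum(p for f, p in power if lo <= f < hi)
        shares.append(band / spectral_total if spectral_total else 0.0)
    return total_energy, shares


if __name__ == "__main__":
    edges = [0.0, 1000.0, 2500.0, 4000.0]  # assumed low / middle / high bands
    for freq in (500.0, 3000.0):
        energy, shares = band_energy_shares(sine(freq, 8000, 256), 8000, edges)
        print(freq, round(energy, 3), [round(s, 3) for s in shares])
```

Comparing the shares for a normal and a distant-talking recording of the same utterance would show whether energy moves from the low band toward the middle or high bands, which is the kind of shift the abstract reports.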
