Search | Korea Science

Park, Chul-Ho;Heo, Won-Chul;Bae, Keun-Sung
- Proceedings of the KSPS conference / 2005.11a / pp.147-150 / 2005
Speech recognition performance in a car environment degrades severely when music or news is playing from a radio or CD player. Because the reference signals are available from the car's audio unit, they can be removed with an adaptive filter. In this paper, we present experimental results of speech recognition in a car environment using an echo canceller. For this purpose, we generate test speech signals by adding music or news to the car-noise speech from the Aurora2 DB. An HTK-based continuous HMM system is used as the recognizer. In addition, the MMSE-STSA method is applied to the output of the echo canceller to further suppress the residual noise.
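The reference-based cancellation described in this abstract can be illustrated with a minimal sketch. The abstract does not say which adaptive algorithm the authors used, so the normalized LMS (NLMS) update below is an assumption, as are the function name `nlms_echo_cancel` and the parameters `taps`, `mu`, and `eps`.

```python
def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Remove the part of `mic` that is predictable from `ref` (NLMS sketch).

    mic: microphone samples (speech plus radio/CD sound reaching the mic)
    ref: reference samples taken from the audio unit
    Returns (cleaned samples, final filter weights).
    NLMS is an assumed choice; the abstract only says "adaptive filter".
    """
    w = [0.0] * taps      # adaptive estimate of the echo path
    buf = [0.0] * taps    # most recent reference samples, newest first
    cleaned = []
    for d, x in zip(mic, ref):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # predicted echo
        e = d - y                                    # echo-cancelled sample
        norm = sum(b * b for b in buf) + eps         # reference energy in the buffer
        step = mu * e / norm
        w = [wi + step * bi for wi, bi in zip(w, buf)]
        cleaned.append(e)
    return cleaned, w
```

In use, `mic` would be the noisy speech with music or news added and `ref` the corresponding audio-unit signal; the returned `cleaned` signal is what a residual-noise suppressor such as MMSE-STSA would then process.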

Jeong, Ju-Hyun;Song, Hwa-Jeon;Kim, Hyung-Soon
- Proceedings of the KSPS conference / 2005.11a / pp.59-62 / 2005
Voice activity detection (VAD) is important in many areas of speech processing technology. Speech/nonspeech discrimination in noisy environments is a difficult task because the feature parameters used for VAD are sensitive to the surrounding environment. As a result, VAD performance is severely degraded at low signal-to-noise ratios (SNRs). In this paper, a new VAD algorithm is proposed based on the degree of voicing and the Quantile SNR (QSNR). These two feature parameters are more robust in noisy environments than other features such as energy and spectral entropy. The effectiveness of the proposed algorithm is evaluated under the diverse noisy environments in the Aurora2 DB. According to our experiments, the proposed VAD outperforms the ETSI Advanced Frontend VAD.
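The abstract does not define QSNR, so the following is only a rough sketch of the general idea of a quantile-based SNR: a low quantile of the frame energies serves as the noise-floor estimate, and each frame is scored against it. The function names, the quantile `q`, and the threshold `threshold_db` are hypothetical, and the degree-of-voicing feature is not modeled.

```python
import math

def quantile_snr_db(frame_energies, q=0.1, eps=1e-12):
    """Per-frame SNR in dB against a quantile-based noise floor.

    Assumption: the noise floor is the q-th quantile of the frame
    energies. This is an illustration, not the paper's QSNR definition.
    """
    ordered = sorted(frame_energies)
    noise = ordered[int(q * (len(ordered) - 1))]
    return [10.0 * math.log10((e + eps) / (noise + eps)) for e in frame_energies]

def simple_vad(frame_energies, threshold_db=6.0, q=0.1):
    """Mark a frame as speech when its quantile SNR exceeds the threshold."""
    return [snr > threshold_db for snr in quantile_snr_db(frame_energies, q)]
```

A quantile is used rather than the minimum or the mean because it is less affected by isolated outlier frames and does not require knowing in advance which frames are noise-only.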

Choi, Bo Kyeong;Ban, Sung Min;Kim, Hyung Soon
- The Journal of the Acoustical Society of Korea / v.34 no.4 / pp.316-320 / 2015
In this paper, the pole filtering concept is applied to Mel-frequency cepstral coefficient (MFCC) feature vectors within the conventional cepstral mean normalization (CMN) and cepstral mean and variance normalization (CMVN) frameworks. Additionally, the performance of cepstral mean and scale normalization (CMSN), which uses scale normalization instead of variance normalization, is evaluated in speech recognition experiments in noisy environments. Because CMN and CMVN are usually performed on a per-utterance basis, reliable estimation of the mean and variance is not guaranteed for short utterances. Applying the pole filtering and scale normalization techniques to the feature normalization process alleviates this problem. Experimental results on the Aurora 2 database (DB) show that the feature normalization method combining pole filtering and scale normalization yields the largest improvement.
https://doi.org/10.7776/ASK.2015.34.4.316
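For orientation, the conventional per-utterance CMN and CMVN baselines named in this abstract can be sketched as below. This covers only the standard normalizations; the pole filtering and scale normalization (CMSN) variants evaluated in the paper are not reproduced, since the abstract does not give their formulas. The function name `normalize_cepstra` and its parameters are assumptions for illustration.

```python
import math

def normalize_cepstra(frames, use_variance=True, eps=1e-8):
    """Per-utterance CMN (use_variance=False) or CMVN (use_variance=True).

    frames: list of equal-length feature vectors, one per frame.
    The mean and standard deviation are estimated from this utterance
    alone, which is why short utterances give unreliable estimates.
    """
    n = len(frames)
    columns = list(zip(*frames))                 # one tuple per coefficient
    means = [sum(col) / n for col in columns]
    if use_variance:
        stds = [math.sqrt(sum((c - m) ** 2 for c in col) / n)
                for col, m in zip(columns, means)]
    else:
        stds = [1.0] * len(columns)
    # eps guards against division by zero for constant coefficients
    return [[(c - m) / max(s, eps) for c, m, s in zip(frame, means, stds)]
            for frame in frames]
```

Calling `normalize_cepstra(frames, use_variance=False)` subtracts only the per-coefficient mean (CMN); the default call additionally divides by the per-coefficient standard deviation (CMVN).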