References
- Li, J., Deng, L., Gong, Y., & Haeb-Umbach, R. (2014). An overview of noise-robust automatic speech recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22(4), 745-777. https://doi.org/10.1109/TASLP.2014.2304637
- Atal, B. (1974). Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification. Journal of the Acoustical Society of America, 55(6), 1304-1312. https://doi.org/10.1121/1.1914702
- Viikki, O., Bye, D., & Laurila, K. (1998). A recursive feature vector normalization approach for robust speech recognition in noise. Proceedings of the IEEE ICASSP (pp. 733-736).
- Alam, J., Ouellet, P., Kenny, P., & O'Shaughnessy, D. (2011). Comparative evaluation of feature normalization techniques for speaker verification. Proceedings of the International Conference on Nonlinear Speech Processing (pp. 246-253).
- Molau, S., Hilger, F., & Ney, H. (2003). Feature space normalization in adverse acoustic conditions. Proceedings of the IEEE ICASSP (pp. 656-659).
- Choi, B., Ban, S., & Kim, H. (2015). Pole-filtered cepstral normalization methods for robust speech recognition. Proceedings of the 2015 Spring Conference of the Korean Society of Speech Sciences (pp. 101-102). (최보경.반성민.김형순 (2015). 강인한 음성인식을 위한 극점 필터링된 켑스트럼 정규화 방식. 한국음성학회 2015 봄학술대회 논문집, 101-102.)
- Choi, B., Ban, S., & Kim, H. (2015). Cepstral feature normalization methods using pole filtering and scale normalization for robust speech recognition. The Journal of the Acoustical Society of Korea, 34(4), 316-320. (최보경.반성민.김형순 (2015). 강인한 음성인식을 위한 극점 필터링 및 스케일정규화를 이용한 켑스트럼 특징 정규화 방식. 한국음향학회지, 34(4), 316-320.) https://doi.org/10.7776/ASK.2015.34.4.316
- Naik, D. (1995). Pole-filtered cepstral mean subtraction. Proceedings of the IEEE ICASSP (pp. 157-160).
- Schroeder, M. R. (1981). Direct (nonrecursive) relations between cepstrum and predictor coefficients. IEEE Transactions on Acoustics, Speech, and Signal Processing, 29(2), 297-301. https://doi.org/10.1109/TASSP.1981.1163546
- Hirsch, H. G., & Pearce, D. (2000). The aurora experimental framework for the performance evaluations of speech recognition systems under noisy conditions. Proceedings of the ASR2000-Automatic Speech Recognition: Challenges for the New Millenium ISCA Tutorial and Research Workshop (pp. 181-188).
- Acero, A., & Huang, X. (1995). Augmented cepstral normalization for robust speech recognition. Proceedings of the IEEE Automatic Speech Recognition Workshop (pp. 146-147).
- Compernolle, D. V. (1989). Noise adaptation in a hidden Markov model speech recognition system. Computer Speech and Language, 3(2), 151-167. https://doi.org/10.1016/0885-2308(89)90027-2
- Ying, D., Yan, Y., Dang, J., & Soong, F. K. (2011). Voice activity detection based on an unsupervised learning framework. IEEE Transactions on Audio, Speech, and Language Processing, 19(8), 2624-2633. https://doi.org/10.1109/TASL.2011.2125953
- ETSI Standard (2003). Speech processing, transmission and quality aspects (STQ); distributed speech recognition; advanced frontend feature extraction algorithm; compression algorithms. ETSI Technical Report ES 202 050, 1.1.3.
- Abdel-Hamid, O., Mohamed, A., Jiang, H., Deng, L., Penn, G., & Yu, D. (2014). Convolutional neural networks for speech recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22(10), 1533-1545. https://doi.org/10.1109/TASLP.2014.2339736