References
- A. V. Nefian, L. Liang, X. Pi, L. Xioxiang, C. Mao and K. Murphy, "Dynamic Bayesian Networks for Audio-Visual Speech Recognition," EURASIP Journal on Applied Signal Processing, vol. 1, pp. 1274-1288, 2002 https://doi.org/10.1155/S1110865702206083
- E. D. Petajan, "Automatic Lipreading to Enhance Speech Recognition," Proceedings of IEEE Conf. on Computer Vision and Pattern Recognition, pp. 40-47, 1985
- T. Chen, "Audiovisual speech processing," IEEE Signal Processing Magazine, vol. 18, no. 1, pp. 9-21, 2001 https://doi.org/10.1109/79.911195
- P. Duchnowski, U. Meier, and A. Waibel, "See Me, Hear Me: Integrating Automatic Speech Recognition and Lipreading," Proceedings of ICSLP, pp. 547-550, 1994
- G. Potamianos, C. Neti, J. Luettin, and I. Matthews, "Audio-Visual Automatic Speech Recognition: An Overview," in Issues in Visual and Audio-Visual Speech Processing, G. Bailly, E. Vatikiotis-Bateson, and P. Perrier (Eds.), MIT Press, Boston, 2004
- F. Berthommier and H. Glotin, "A new SNR-feature mapping for robust multistream speech recognition," Proceedings of International Congress of Phonetic Sciences (ICPhS), vol. 1, pp. 711-715, San Francisco, 1999
- Md. J. Alam, Md. F. Chowdhury, and Md. F. Alam, "Comparative Study of A Priori Signal-to-Noise Ratio (SNR) Estimation Approaches for Speech Enhancement," Journal of Electrical & Electronics Engineering, vol. 9, no. 1, pp. 809-817, 2009
- A. Rogozan, P. Deléglise, and M. Alissali, "Adaptive determination of audio and visual weights for automatic speech recognition," Proceedings of European Tutorial Workshop on Audio-Visual Speech Processing (AVSP), pp. 61-64, 1997
- H. Glotin, D. Vergyri, C. Neti, G. Potamianos, and J. Luettin, "Weighting schemes for audio-visual fusion in speech recognition," Proceedings of IEEE Int. Conf. Acoust., Speech, Signal Process., vol. 1, pp. 173-176, 2001 https://doi.org/10.1109/ICASSP.2001.940795
- M. Heckmann, F. Berthommier and K. Kroschel, "Noise Adaptive Stream Weighting in Audio-Visual Speech Recognition," EURASIP Journal on Applied Signal Processing, vol. 2002, no. 1, pp. 1260-1273, 2002 https://doi.org/10.1155/S1110865702206150
- M. Gurban and J.-Ph. Thiran, "Using Entropy as a Stream Reliability Estimate for Audio-Visual Speech Recognition," Proceedings of 16th European Signal Processing Conference, Lausanne, Switzerland, August 25-29, 2008
- J.-S. Lee and C. H. Park, "Adaptive Decision Fusion for Audio-Visual Speech Recognition," in Speech Recognition, Technology and Applications, I-Tech, Vienna, Austria, pp. 275-296, 2008
- J. Kennedy and R. Eberhart, "Particle Swarm Optimization," Proceedings of the IEEE Int. Conf. on Neural Networks, Piscataway, NJ, pp. 1942-1948, 1995
- S. Kullback and R. A. Leibler, "On Information and Sufficiency," The Annals of Mathematical Statistics, vol. 22, no. 1, pp. 79-86, 1951 https://doi.org/10.1214/aoms/1177729694
- A. Bhattacharyya, "On a Measure of Divergence between Two Statistical Populations Defined by Probability Distributions," Bull. Calcutta Math. Soc., vol. 35, pp. 99-109, 1943
- Printz et al., "Theory and Practice of Acoustic Confusability," Proceedings of the ISCA ITRW ASR2000, pp. 77-84, Paris, France, Sep. 18-20, 2000 https://doi.org/10.1006/csla.2001.0188
- J. Hershey and P. Olsen, "Approximating the Kullback Leibler divergence between Gaussian mixture models," Proceedings of ICASSP 2007, Honolulu, Hawaii, April 2007
- J. R. Hershey and P. A. Olsen, "Variational Bhattacharyya Divergence for Hidden Markov Models," Proceedings of ICASSP 2008, pp. 4557-4560, 2008
- J. R. Hershey, P. A. Olsen, and S. J. Rennie, "Variational Kullback Leibler Divergence for Hidden Markov Models," Proceedings of ASRU, Kyoto, Japan, pp. 323-328, December 2007 https://doi.org/10.1109/ASRU.2007.4430132
- J.-Y. Chen, P. Olsen, and J. Hershey, "Word Confusability - Measuring Hidden Markov Model Similarity," Proceedings of Interspeech 2007, pp. 2089-2092, August 2007
- J. Silva and S. Narayanan, "Average Divergence Distance as a Statistical Discrimination Measure for Hidden Markov Models," IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 3, pp. 890-906, May 2006 https://doi.org/10.1109/TSA.2005.858059
- http://www.speech.cs.cmu.edu/comp.speech/Section1/Data/noisex.html