References
- C. C. Chibelushi, F. Deravi, and J. S. D. Mason, “A review of speech-based bimodal recognition,” IEEE Trans. Multimedia, Vol. 4, No. 1, pp. 23-37, 2002 https://doi.org/10.1109/6046.985551
- H. Yao, W. Gao, W. Shan, and M. Xu, "Visual features extracting and selecting for lipreading," in Proc. Int. Conf. Audio- and Video-based Biometric Person Authentication, Guildford, UK, pp. 251-259, Jun. 2003
- J.-S. Lee, S.-H. Shim, S.-Y. Kim, and C. H. Park, “Bimodal speech recognition using robust feature extraction of lip movement under uncontrolled illumination conditions,” Telecommunications Review, Vol. 14, No. 1, pp. 123-134, Feb. 2004
- C. Bregler and Y. Konig, “Eigenlips for robust speech recognition,” in Proc. Int. Conf. Acoustics, Speech and Signal Processing, Vol. 2, Adelaide, Australia, pp. 669-672, 1994
- G. Potamianos, A. Verma, C. Neti, G. Iyengar, and S. Basu, “A cascade image transform for speaker independent automatic speechreading,” in Proc. Int. Conf. Multimedia and Expo, Vol. 2, New York, pp. 1097-1100, 2000
- P. Scanlon and R. Reilly, “Feature analysis for automatic speechreading,” in Proc. Int. Conf. Multimedia and Expo, Tokyo, Japan, pp. 625-630, Apr. 2001
- G. Potamianos and C. Neti, “Audio-visual speech recognition in challenging environments,” in Proc. Eurospeech, Geneva, Switzerland, pp. 1293-1296, Sep. 2003
- K. Saenko, T. Darrell, and J. Glass, “Articulatory features for robust visual speech recognition,” in Proc. Int. Conf. Multimodal Interfaces, State College, PA, pp. 152-158, Oct. 2004
- A. Amer and E. Dubois, “Fast and reliable structure-oriented video noise estimation,” IEEE Trans. Circuits and Systems for Video Technology, Vol. 15, No. 1, pp. 113-118, Jan. 2005 https://doi.org/10.1109/TCSVT.2004.837017
- J.-S. Lee and C. H. Park, “Information fusion for audio-visual speech recognition: comparison of reliability measures and a neural-network-based integration method,” Telecommunications Review, Vol. 17, No. 3, pp. 538-550, Jun. 2007
- J.-S. Lee and C. H. Park, “Training hidden Markov models by hybrid simulated annealing for visual speech recognition,” in Proc. Int. Conf. Systems, Man, and Cybernetics, pp. 198-202, Taipei, Taiwan, Oct. 2006
- R. C. Gonzalez and R. E. Woods, “Digital Image Processing,” Prentice-Hall, Upper Saddle River, NJ, 2001
- S. Lucey, “An evaluation of visual speech features for the tasks of speech and speaker recognition,” in Proc. Int. Conf. Audio- and Video-based Biometric Person Authentication, Guildford, UK, pp. 260-267, Jun. 2003
- X. Huang, A. Acero, and H.-W. Hon, “Spoken Language Processing,” Prentice-Hall, Upper Saddle River, NJ, 2001
- J. J. Ohala, “The temporal regulation of speech,” in Auditory Analysis and Perception, eds., G. Fant and M. A. Tatham, Academic Press, London, UK, pp. 431-453, 1975
- K. Munhall and E. Vatikiotis-Bateson, “The moving face during speech communication,” in Hearing by Eye II: Advances in the Psychology of Speechreading and Audio-Visual Speech, eds., R. Campbell, B. Dodd, and D. Burnham, Psychology Press, Hove, UK, pp. 123-142, 1998
- J. G. Proakis and D. G. Manolakis, “Digital Signal Processing,” Prentice-Hall, Upper Saddle River, NJ, 1996
- M. Vitkovitch and P. Barber, “Visible speech as a function of image quality: effects of display parameters on lipreading ability,” Applied Cognitive Psychology, Vol. 10, pp. 121-140, 1996 https://doi.org/10.1002/(SICI)1099-0720(199604)10:2<121::AID-ACP371>3.0.CO;2-V