Monosyllable Speech Recognition through Facial Movement Analysis |
Kang, Dong-Won (Dept. of Biomedical Engineering, Konkuk University)
Seo, Jeong-Woo (Dept. of Biomedical Engineering, Konkuk University)
Choi, Jin-Seung (Dept. of Biomedical Engineering, Konkuk University)
Choi, Jae-Bong (Department of Mechanical Systems Engineering, Hansung University)
Tack, Gye-Rae (Dept. of Biomedical Engineering, BK21+ Research Institute of Biomedical Engineering, Konkuk University) |
1 | T. Chen, H. P. Graf, and K. Wang, "Speech-assisted video processing: Interpolation and low-bitrate coding," 28th Annual Asilomar Conference on Signals, Systems, and Computers (Asilomar '94), pp. 957-979, 1994. |
2 | G. Bailly, E. Vatikiotis-Bateson, and P. Perrier, Visual and audio-visual speech processing, MIT Press, 2004. |
3 | A. Bagai, H. Gandhi, R. Goyal, M. Kohli, and T. V. Prasad, "Lip-Reading using Neural Networks," International Journal of Computer Science and Network Security, vol. 9, no. 4, pp. 108-111, 2009. |
4 | H. Mehrotra, G. Agrawal, and M. C. Srivastava, "Automatic Lip Contour Tracking and Visual Character Recognition for Computerized Lip Reading," International Journal of Computer Science, vol. 4, no. 1, pp. 62-71, 2009. |
5 | W. J. Ma, X. Zhou, L. A. Ross, J. J. Foxe, and L. C. Parra, "Lip reading aids word recognition most in moderate noise: a Bayesian explanation using high-dimensional feature space," PLoS ONE, vol. 4, no. 3, pp. 1-14, 2009. |
6 | J. J. Shin, J. Lee, and D. J. Kim, "Real-time lip reading system for isolated Korean word recognition," Pattern Recognition, vol. 44, pp. 559-571, 2011. |
7 | M. G. Song, T. P. Thanh, J. Y. Kim, and S. T. Hwang, "A Study on Lip Detection based on Eye Localization for Visual Speech Recognition in Mobile Environment," Journal of Korean institute of intelligent systems, vol. 19, no. 4, pp. 478-484, 2009. |
8 | Y. T. Won, H. D. Kim, M. R. Lee, B. S. Jang, and H. S. Kwak, "A Character Speech Animation System for Language Education for Each Hearing Impaired Person," Journal of digital contents society, vol. 9, no. 3, pp. 389-398, 2008. |
9 | K. H. Lee, J. J. Kum, and S. B. Rhee, "Design & Implementation of Lipreading System using the Articulatory Controls Analysis of the Korean 5 Vowels," The Journal of Korean association of computer education, vol. 8, no. 4, pp. 281-288, 2007. |
10 | J. Ma, R. Cole, B. Pellom, W. Ward, and B. Wise, "Accurate Visible Speech Synthesis Based on Concatenating Variable Length Motion Capture Data," IEEE Transactions on Visualization and Computer Graphics, vol. 12, no. 2, pp. 266-276, 2006. |
11 | G. Bailly, F. Elisei, M. Odisio, D. Pele, D. Caillière, and K. Grein-Cochard, "Talking faces for MPEG-4 compliant scalable face-to-face telecommunication," Proceedings of the Smart Objects Conference (SOC '03), pp. 204-207, 2003. |
12 | P. Scanlon, and R. Reilly, "Feature analysis for automatic speech reading," Proc. of the IEEE Int. Conf. on Multimedia Signal Processing (MMSP '01), pp. 625-630, 2001. |
13 | R. Scott, "Sparking life: notes on the performance capture sessions for The Lord of the Rings: The Two Towers," ACM SIGGRAPH Computer Graphics, vol. 37, no. 4, pp. 17-21, 2003. |
14 | Y. Cao, P. Faloutsos, E. Kohler, and F. Pighin, "Real-time speech motion synthesis from recorded motions," In Proceedings of Eurographics/SIGGRAPH Symposium on Computer Animation (SCA '04), pp. 345-353, 2004. |
15 | S. Dupont, and J. Luettin, "Audio-visual speech modeling for continuous speech recognition," IEEE Transactions on Multimedia, vol. 2, pp. 141-151, 2000. |
16 | C. G. Lee, I. M. So, Y. U. Kim, J. R. Kim, S. K. Kang, and S. T. Jung, "Implementation of three dimension lip reading system using stereo vision," Proceedings of Korea multimedia society conference (KMMS '04), pp. 489-492, 2004. |
17 | H. S. Koh, S. M. Han, J. U. Chu, S. H. Park, J. B. Choi, G. W. Choi, D. S. Hwang, and I. C. Youn, "The three-dimensional lip shape tracking system using stereo camera," Proceedings of the Korean Society of Precision Engineering (KSPE '11) Conference, pp. 979-980, 2011. |
18 | K. H. Lee, R. Yong, and S. O. Kim, "A study on speechreading the Korean 8 vowels," Journal of the Korea society of computer and information, vol. 14, no. 3, pp. 173-182, 2009. |
19 | J. Y. Kim, S. H. Min, and S. H. Choi, "Robustness of Bimodal Speech Recognition on Degradation of Lip Parameter Estimation Performance," Journal of the Korean Society of Phonetic Science and Speech Technology, vol. 10, no. 2, pp. 29-33, 2003. |
20 | K. H. Nam, and C. S. Bae, "A study on the lip shape recognition algorithm using 3-D Model," The Journal of the Korean Institute of Maritime Information & Communication Sciences, vol. 6, no. 5, pp. 783-788, 2002. |
21 | G. Galatas, G. Potamianos, D. Kosmopoulos, C. McMurrough, and F. Makedon, "Bilingual Corpus for AVASR using Multiple Sensors and Depth Information," Auditory-Visual Speech Processing (AVSP '11), pp. 103-106, 2011. |
22 | I. S. Pandzic, and R. Forchheimer, MPEG-4 Facial Animation: The Standard, Implementation, and Applications, John Wiley and Sons, Inc., New York, 2002. |
23 | A. Srinivasan, "Speech Recognition Using Hidden Markov Model," Applied Mathematical Sciences, vol. 5, no. 79, pp. 3943-3948, 2011. |
24 | D. A. Pierre, Optimization Theory with Applications, Dover Publications, Inc., New York, 1986. |
25 | N. Eveno, A. Caplier, and P. Y. Coulon, "Accurate and quasi-automatic lip tracking," IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 5, pp. 706-715, 2004. |
26 | X. D. Huang, Y. Ariki, and M. A. Jack, Hidden Markov Models for Speech Recognition, Edinburgh Univ. Press, Edinburgh, 1990. |