Search | Korea Science

오현화;김인철;김동수;진성일
- The Journal of the Acoustical Society of Korea
- /
- v.21 no.1
- /
- pp.19-26
- /
- 2002
This paper defines the visual basic speech units, visemes and investigates various visual features of a lip for the effective Korean lipreading. First, we analyzed the visual characteristics of the Korean vowels from the database of the lip image sequences obtained from the multi-speakers, thereby giving a definition of seven Korean vowel visemes. Various spatio-temporal features of a lip are extracted from the feature points located on both inner and outer lip contours of image sequences and their classification performances are evaluated by using a hidden Markov model based classifier for effective lipreading. The experimental results for recognizing the Korean visemes have demonstrated that the feature victor containing the information of inner and outer lip contours can be effectively applied to lipreading and also the direction and magnitude of the movement of a lip feature point over time is quite useful for Korean lipreading.
PDF KSCI

오현화;권홍석;손종목;진성일;배건성
- Journal of the Institute of Electronics Engineers of Korea CI
- /
- v.40 no.5
- /
- pp.289-297
- /
- 2003
The performance of a bimodal system is affected by the accuracy of the endpoint detection from the input signal as well as the performance of the speech recognition or lipreading system. In this paper, we propose the endpoint detection method which detects the endpoints from the audio and video signal respectively and utilizes the signal to-noise ratio (SNR) estimated from the input audio signal to select the reliable endpoints to the acoustic noise. In other words, the endpoints are detected from the audio signal under the high SNR and from the video signal under the low SNR. Experimental results show that the bimodal system using the proposed endpoint detector achieves satisfactory recognition rates, especially when the acoustic environment is quite noisy.
PDF KSCI

오현화;김인철;김동수;진성일
- Proceedings of the IEEK Conference
- /
- 2001.06d
- /
- pp.29-32
- /
- 2001
Visual speech information improves the performance of speech recognition, especially in noisy environment. We have tested the various spatial-temporal features for the Korean lipreading and evaluated the performance by using a hidden Markov model based classifier. The results have shown that the direction as well as the magnitude of the movement of the lip contour over time is useful features for the lipreading.
PDF