• Title/Summary/Keyword: Background noise level


Design and Implementation of a Real-Time Lipreading System Using PCA & HMM (PCA와 HMM을 이용한 실시간 립리딩 시스템의 설계 및 구현)

  • Lee, Chi-Geun;Lee, Eun-Suk;Jung, Sung-Tae;Lee, Sang-Seol
    • Journal of Korea Multimedia Society / v.7 no.11 / pp.1597-1609 / 2004
  • Many lipreading systems have been proposed to compensate for the drop in speech recognition rate in noisy environments. Previous lipreading systems work only under specific conditions, such as artificial lighting and a predefined background color. In this paper, we propose a real-time lipreading system that allows the speaker to move and relaxes the restrictions on color and lighting conditions. The proposed system extracts the face and lip regions, along with the essential visual information, in real time from an input video sequence captured with a common PC camera, and uses that visual information to recognize uttered words in real time. It uses a hue histogram model to extract the face and lip regions, the mean shift algorithm to track the face of a moving speaker, PCA (Principal Component Analysis) to extract the visual information for learning and testing, and an HMM (Hidden Markov Model) as the recognition algorithm. The experimental results show that our system achieves a recognition rate of 90% for speaker-dependent lipreading and, when combined with audio speech recognition, raises the speech recognition rate up to 40~85%, depending on the noise level.
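
As a rough illustration of the hue histogram step mentioned in the abstract, the sketch below builds a normalized hue histogram from sample pixels and scores new pixels against it (back-projection). It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the 16-bin resolution, and the sample RGB values are all hypothetical.

```python
import colorsys

def hue_bin(pixel, bins):
    """Map an (r, g, b) pixel in 0..255 to a hue bin index."""
    r, g, b = pixel
    h, _, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return min(int(h * bins), bins - 1)

def hue_histogram(pixels, bins=16):
    """Build a normalized hue histogram from training pixels."""
    hist = [0.0] * bins
    for p in pixels:
        hist[hue_bin(p, bins)] += 1.0
    total = sum(hist)
    return [c / total for c in hist] if total else hist

def back_project(pixels, hist):
    """Score each pixel by the histogram weight of its hue bin."""
    return [hist[hue_bin(p, len(hist))] for p in pixels]

# Hypothetical skin-tone training samples (made-up values, not from the paper).
skin_samples = [(224, 172, 105), (198, 134, 66), (241, 194, 125)]
model = hue_histogram(skin_samples)

# Score one skin-like pixel and one blue background pixel.
scores = back_project([(210, 150, 90), (30, 60, 200)], model)
```

A per-pixel score map of this kind is the usual input to a mean shift tracker, which repeatedly moves a search window toward the centroid of the scores inside it.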


Voice Packet Processing Scheme for Voice Quality and Bandwidth Efficiency in VoIP (VoIP의 음성품질/대역효율 개선을 위한 음성패킷 처리)

  • Kim, Jae-Won;Sohn, Dong-Chul
    • Journal of Korea Multimedia Society
    • /
    • v.7 no.7
    • /
    • pp.896-904
    • /
    • 2004
  • In this paper, we present an efficient variable-rate speech coder for spectral efficiency and a packet processing technique for packet loss compensation, for a voice codec with a 10 ms frame in VoIP service. By disconnecting users from the spectral resource during silence intervals, which make up about 60% of the time, a variable-rate voice coder based on voice activity detection (VAD) can double the spectral gain. The performance of the method was analyzed in terms of the variation of the detected voice activity factor and the degraded speech frame ratio under various background noise levels, and compared with that of G.729B, the ITU-T 8 kbps standard speech codec. The method for compensating lost packets adds recovery data to the main stream and applies an error concealment scheme to enhance speech quality; its performance is verified by the quality of the reconstructed speech. The proposed scheme can double the spectral gain or, by using the bandwidth freed by VAD, enhance speech quality by 3 dB. The proposed method can therefore enhance either the spectral efficiency or the speech quality of VoIP.
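
To make the link between voice activity and bandwidth concrete, the sketch below applies a simple energy-threshold VAD to 10 ms frames and derives the voice activity factor and the resulting capacity gain. It is only a sketch under stated assumptions: it is not the paper's detector nor the G.729B VAD, and the 8 kHz sampling rate, the threshold, and the synthetic signal are hypothetical choices.

```python
import math

FRAME_MS = 10        # frame length stated in the abstract
SAMPLE_RATE = 8000   # assumption: narrowband telephony sampling rate
FRAME_LEN = SAMPLE_RATE * FRAME_MS // 1000  # samples per frame

def frame_energies(samples, frame_len=FRAME_LEN):
    """Mean squared amplitude of each full frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def voice_activity(samples, threshold):
    """One boolean per frame: True when the frame energy exceeds the threshold."""
    return [e > threshold for e in frame_energies(samples)]

# Synthetic signal: 4 frames of a 440 Hz tone followed by 6 frames of silence.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
        for n in range(4 * FRAME_LEN)]
silence = [0.0] * (6 * FRAME_LEN)
flags = voice_activity(tone + silence, threshold=0.01)

activity_factor = sum(flags) / len(flags)  # fraction of frames that carry speech
bandwidth_gain = 1 / activity_factor       # gain if silent frames are not sent
```

With 60% of the frames silent, the activity factor is 0.4 and the ideal gain is 1 / 0.4 = 2.5, which is in line with the roughly twofold gain the abstract reports once detection errors and overhead are taken into account.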


An Efficient Character Image Enhancement and Region Segmentation Using Watershed Transformation (Watershed 변환을 이용한 효율적인 문자 영상 향상 및 영역 분할)

  • Choi, Young-Kyoo;Rhee, Sang-Burm
    • The KIPS Transactions:PartB
    • /
    • v.9B no.4
    • /
    • pp.481-490
    • /
    • 2002
  • Off-line handwritten character recognition suffers from incomplete preprocessing because it lacks dynamic information and must cope with varied handwriting, extreme overlap of consonants and vowels, and many erroneous stroke images. Consequently, off-line handwritten character recognition requires the study of various preprocessing methods, such as binarization and thinning. This paper considers the running time of the watershed algorithm and the quality of the resulting image as preprocessing for off-line handwritten Korean character recognition. It proposes applying an effective watershed algorithm to segment the character region from the background region in a gray-level character image, together with a segmentation function for binarization based on the extracted watershed image. It also proposes a thinning method that effectively extracts the skeleton through a conditional test mask, taking running time and skeleton quality into account, and evaluates the efficiency of existing methods and the proposed methods in terms of running time and quality. The average execution time was 2.16 seconds for the previous method and 1.72 seconds for the proposed method. We show that, compared with the previous method, the proposed method effectively removes noise in overlapping strokes.
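
For reference, the two average execution times quoted in the abstract correspond to the following relative reduction and speedup (simple arithmetic on the reported figures, not an additional result from the paper):

```latex
\[
\frac{2.16 - 1.72}{2.16} = \frac{0.44}{2.16} \approx 0.204,
\qquad
\frac{2.16}{1.72} \approx 1.26
\]
```

That is, the proposed method runs in about 20% less time, or roughly 1.26 times as fast.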