• Title/Summary/Keyword: HMM(HMM)


Speaker-Independent Korean Digit Recognition Using HCNN with Weighted Distance Measure (가중 거리 개념이 도입된 HCNN을 이용한 화자 독립 숫자음 인식에 관한 연구)

  • 김도석;이수영
    • The Journal of Korean Institute of Communications and Information Sciences / v.18 no.10 / pp.1422-1432 / 1993
  • The nonlinear mapping function of the HCNN (Hidden Control Neural Network) can change over time to model the temporal variability of a speech signal, combining the nonlinear prediction of conventional neural networks with the segmentation capability of the HMM. This paper makes two contributions. First, we show that the HCNN outperforms the HMM. Second, we propose an HCNN whose prediction error is measured by a weighted distance, a measure better suited to the HCNN, and show that the proposed system is superior for speaker-independent speech recognition tasks. The weighted distance accounts for the differences between the variances of the components of the feature vector extracted from the speech data. In a speaker-independent Korean digit recognition experiment, the HCNN with Euclidean distance achieved a recognition rate of 95%. This is 1.28% higher than the HMM and shows that the HCNN, which models a dynamical system, is superior to the HMM, which is based on statistical restrictions. The HCNN with weighted distance achieved 97.35%, which is 2.35 percentage points better than the HCNN with Euclidean distance. The HCNN with weighted distance performs better because it reduces the variation of the recognition error rate across speakers by raising the recognition rate for speakers who have many misclassified utterances. We therefore conclude that the HCNN with weighted distance is more suitable for speaker-independent speech recognition tasks.
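
The abstract states only that the weighted distance accounts for the differing variances of the feature-vector components; the exact weighting is not given. The following is a minimal sketch that assumes inverse-variance weights (an assumption, not the authors' stated formula). The function names and toy data are hypothetical.

```python
def component_variances(vectors):
    """Per-component population variance over equal-length feature vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    means = [sum(v[i] for v in vectors) / n for i in range(dim)]
    return [sum((v[i] - means[i]) ** 2 for v in vectors) / n for i in range(dim)]


def euclidean_sq_distance(x, y):
    """Plain squared Euclidean distance: every component counts equally."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def weighted_sq_distance(x, y, variances, eps=1e-12):
    """Squared distance with each component scaled by 1/variance (assumed weighting)."""
    return sum((a - b) ** 2 / (var + eps) for a, b, var in zip(x, y, variances))


# Toy data: the second component varies far more than the first.
frames = [[1.0, 10.0], [1.2, 30.0], [0.8, 50.0], [1.0, 70.0]]
variances = component_variances(frames)

predicted, observed = [1.0, 40.0], [1.5, 45.0]
plain = euclidean_sq_distance(predicted, observed)
weighted = weighted_sq_distance(predicted, observed, variances)
print(plain, weighted)
```

With these toy numbers the plain distance is dominated by the high-variance second component, while the weighted distance is dominated by the error in the low-variance first component.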


Face Detection using Ellipse fitting and HMM Face Recognition (Ellipse fitting을 이용한 얼굴 검출 및 HMM 얼굴 인식)

  • 이주영;남궁재찬
    • Proceedings of the Korea Multimedia Society Conference / 2003.11a / pp.204-207 / 2003
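  • Finding an accurate face region separated from the background in real time is the most basic prerequisite for recognition. Among the methods for locating a face, this work uses feature-based extraction of edge information together with an ellipse fitting algorithm to separate the face from the background effectively. For the face database used in face recognition, preprocessed images separated from the background are detected.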


HMM Based Endpoint Detection for Speech Signals

  • Lee Yonghyung;Oh Changhyuck
    • Proceedings of the Korean Statistical Society Conference / 2001.11a / pp.75-76 / 2001
  • An endpoint detection method for speech signals that uses a hidden Markov model (HMM) is proposed. The proposed algorithm performs satisfactorily when applied to isolated-word speech recognition.
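
The abstract gives no details of the model, so the following is only a generic sketch of HMM-based endpoint detection, not the authors' algorithm. It assumes a two-state HMM (silence and speech) with Gaussian emissions over frame log-energy and decodes it with the Viterbi algorithm. All parameter values and names are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def viterbi_endpoints(log_energy, params, stay_prob=0.9):
    """Return (first, last) frame indices decoded as speech, or None if there are none."""
    states = ("silence", "speech")
    log_stay, log_switch = math.log(stay_prob), math.log(1 - stay_prob)

    def trans(prev, cur):
        return log_stay if prev == cur else log_switch

    # Assumed uniform initial state distribution.
    score = {s: math.log(0.5) + gaussian_logpdf(log_energy[0], *params[s]) for s in states}
    backptrs = []
    for x in log_energy[1:]:
        new_score, ptr = {}, {}
        for cur in states:
            best_prev = max(states, key=lambda prev: score[prev] + trans(prev, cur))
            ptr[cur] = best_prev
            new_score[cur] = (score[best_prev] + trans(best_prev, cur)
                              + gaussian_logpdf(x, *params[cur]))
        backptrs.append(ptr)
        score = new_score

    # Backtrack from the best final state to recover the state path.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()

    speech = [i for i, s in enumerate(path) if s == "speech"]
    return (speech[0], speech[-1]) if speech else None


# Hypothetical (mean, std) of log-energy per state, and a toy utterance.
params = {"silence": (-5.0, 1.0), "speech": (0.0, 1.5)}
log_energy = [-5.1, -4.8, -5.3, -0.5, 0.4, 0.1, -0.2, 0.6, -4.9, -5.2]
endpoints = viterbi_endpoints(log_energy, params)
print(endpoints)  # (3, 7)
```

In practice the emission and transition parameters would be estimated from data rather than fixed by hand.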


Depth Image Poselets via Body Part-based Pose and Gesture Recognition (신체 부분 포즈를 이용한 깊이 영상 포즈렛과 제스처 인식)

  • Park, Jae Wan;Lee, Chil Woo
    • Smart Media Journal / v.5 no.2 / pp.15-23 / 2016
  • In this paper, we propose depth poselets built from body-part poses, together with a method for recognizing gestures. Because a gesture is composed of sequential poses, recognizing it requires obtaining the poses as a time series. Distortion and the high number of degrees of freedom make it difficult to recognize a pose correctly, so we use partial poses to obtain pose features accurately without the full-body pose. We define 16 gestures and generate depth images for training based on them. The proposed depth poselets consist of the principal three-dimensional coordinates of a depth image and the depth image of the corresponding body part. In the training process, the defined gestures are captured with a depth camera, depth poselets are generated by obtaining the 3D joint coordinates, and part-gesture HMMs are constructed from the depth poselets. In the testing process, a test image is captured with a depth camera, the foreground is extracted, and the body parts of the input image are extracted by comparison with the depth poselets. The part gestures are then checked by applying the HMMs in order to recognize the gesture. Using HMMs, the gestures are recognized efficiently, with a recognition rate of about 89%.

Emotion recognition in speech using hidden Markov model (은닉 마르코프 모델을 이용한 음성에서의 감정인식)

  • 김성일;정현열
    • Journal of the Institute of Convergence Signal Processing / v.3 no.3 / pp.21-26 / 2002
  • This paper presents a new approach to identifying human emotional states such as anger, happiness, normal, sadness, and surprise, using discrete duration continuous hidden Markov models (DDCHMM). Emotional feature parameters are first defined from the input speech signals. In this study, we used prosodic parameters such as pitch signals, energy, and the derivative of each, which were then used to train HMMs for recognition. Speaker-adapted emotional models based on maximum a posteriori (MAP) estimation were also considered for speaker adaptation. In simulation, the recognition rate of vocal emotion increased gradually as the number of adaptation samples increased.
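
To illustrate why more adaptation samples help, here is a simplified sketch of the standard MAP update for a single Gaussian mean. The abstract does not give the authors' formulation: the scalar form, the prior weight `tau`, and the pitch values below are all assumptions. In a full HMM the sums would be weighted by state-occupancy probabilities.

```python
def map_adapt_mean(prior_mean, samples, tau=10.0):
    """MAP estimate of a Gaussian mean: blends the prior mean with speaker data.

    tau is a hypothetical prior weight; larger tau trusts the prior more.
    """
    n = len(samples)
    return (tau * prior_mean + sum(samples)) / (tau + n)


prior_mean = 120.0                # hypothetical speaker-independent mean pitch (Hz)
speaker_samples = [150.0] * 40    # hypothetical adaptation data from one speaker

adapted = {n: map_adapt_mean(prior_mean, speaker_samples[:n]) for n in (0, 5, 10, 40)}
for n, mean in adapted.items():
    print(n, mean)
```

With no adaptation data the estimate equals the prior mean, and it moves toward the speaker's own statistics as the sample count grows. This is consistent with the trend reported in the abstract.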


A Study on Improved Method of Voice Recognition Rate (음성 인식률 개선방법에 관한 연구)

  • Kim, Young-Po;Lee, Han-Young
    • The Journal of the Korea institute of electronic communication sciences / v.8 no.1 / pp.77-83 / 2013
  • In this paper, we propose and study a method for improving the voice recognition rate. Voices are generally detected with the most widely used method, the HMM (Hidden Markov Model) algorithm. For voice detection, the zero-crossing rate is calculated per unit of voice before determining whether data is present. For voice recognition, the patterns formed by the shapes of the voices are analyzed and then compared with previously learned patterns. In the experiment, the existing HMM algorithm showed a recognition rate of 80%, whereas the proposed algorithm, based on recognizing the patterns formed by the shapes of the voices, showed a recognition rate of 92%, an improvement of about 12 percentage points.
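
The zero-crossing rate mentioned above can be sketched as follows. The frame length, hop size, and toy signal are assumptions; the abstract does not specify the authors' framing or decision thresholds.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def frame_signal(signal, frame_len, hop):
    """Split a signal into fixed-length frames, dropping any incomplete tail."""
    return [signal[i:i + frame_len] for i in range(0, len(signal) - frame_len + 1, hop)]


signal = [0.5, -0.4, 0.6, -0.5, 0.4, -0.6, 0.5, -0.4,   # rapidly alternating, noise-like
          0.2, 0.5, 0.8, 0.5, 0.2, -0.2, -0.5, -0.8]    # slowly varying, voiced-like
rates = [zero_crossing_rate(f) for f in frame_signal(signal, frame_len=8, hop=8)]
print(rates)
```

The first frame crosses zero at every step and the second only once, so the per-frame rate separates the two kinds of segment.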

Eye Pattern Detection Using SVD and HMM Technique from CCD Camera Face Image (CCD 카메라 얼굴 영상에서의 SVD 및 HMM 기법에 의한 눈 패턴 검출)

  • Jin, Kyung-Chan;Miche, Pierre;Park, Il-Yong;Sohn, Byung-Gi;Cho, Jin-Ho
    • Journal of Sensor Science and Technology / v.8 no.1 / pp.63-68 / 1999
  • We propose a method for detecting eye patterns in 2-D images obtained with a CCD video camera. To detect the face region and the eye pattern, we propose a pattern search network and a batch SVD algorithm that is statistically equivalent to PCA. We also use an HMM to improve detection accuracy. The results show that the proposed algorithm is superior to the PCA pattern detection algorithm in both computational cost and detection accuracy. Furthermore, the proposed algorithm is capable of real-time face pattern detection at 2 frames per second.
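
The statistical equivalence of SVD and PCA mentioned above follows from a standard identity, shown here for reference rather than as the authors' derivation. The symbols are assumptions: X is an n × d data matrix whose columns have been mean-centered, with thin SVD factors U, Σ, and V.

```latex
X = U \Sigma V^{\top}, \qquad
C = \frac{1}{n-1} X^{\top} X = V \, \frac{\Sigma^{2}}{n-1} \, V^{\top}, \qquad
\lambda_i = \frac{\sigma_i^{2}}{n-1}
```

The columns of V are therefore the principal directions, and each covariance eigenvalue λ_i is a scaled squared singular value σ_i², so the SVD of X yields the PCA basis without forming the covariance matrix C explicitly.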


ETRI small-sized dialog style TTS system (ETRI 소용량 대화체 음성합성시스템)

  • Kim, Jong-Jin;Kim, Jeong-Se;Kim, Sang-Hun;Park, Jun;Lee, Yun-Keun;Hahn, Min-Soo
    • Proceedings of the KSPS conference / 2007.05a / pp.217-220 / 2007
  • This study outlines a small-sized, dialog-style ETRI Korean TTS system that applies HMM-based speech synthesis techniques. To build the VoiceFont, 500 dialog-style sentences were used to train the HMMs. Context information about phonemes, syllables, words, phrases, and sentences was extracted fully automatically to build context-dependent HMMs. The acoustic model was trained on acoustic features such as Mel-cepstrum, logF0, and its delta and delta-delta. The resulting VoiceFont is 0.93Mb in size. The developed HMM-based TTS system was installed on an ARM720T processor running at a 60 MHz clock. To reduce computation time, the MLSA inverse filtering module is implemented in assembly language. The fully implemented system runs 1.73 times faster than real time.
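
As a quick reading of the reported speed, the sketch below converts "1.73 times faster than real time" into a real-time factor and a synthesis time. It assumes the figure means audio duration divided by processing time; the 10-second utterance is a hypothetical example.

```python
def synthesis_time(audio_seconds, speedup):
    """Processing time needed to synthesize audio of the given duration."""
    return audio_seconds / speedup


speedup = 1.73                      # reported in the abstract
real_time_factor = 1 / speedup      # processing time per second of audio
ten_second_cost = synthesis_time(10.0, speedup)

print(round(real_time_factor, 3))   # 0.578
print(round(ten_second_cost, 2))    # 5.78
```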
