• Title/Summary/Keyword: 연속 HMM (continuous HMM)

Performance Improvement of Cardiac Disorder Classification Based on Automatic Segmentation and Extreme Learning Machine (자동 분할과 ELM을 이용한 심장질환 분류 성능 개선)

  • Kwak, Chul;Kwon, Oh-Wook
    • The Journal of the Acoustical Society of Korea / v.28 no.1 / pp.32-43 / 2009
  • In this paper, we improve the performance of cardiac disorder classification from continuous heart sound signals using automatic segmentation and an extreme learning machine (ELM). The accuracy of conventional cardiac disorder classification systems degrades because murmurs and click sounds in abnormal heart sound signals cause incorrect or missing starting points of the first (S1) and second (S2) heart pulses in the automatic segmentation stage. To reduce the performance degradation caused by segmentation errors, we find the positions of the S1 and S2 pulses, correct them using the time difference of S1 or S2, and extract a single period of the heart sound signal. We then obtain a feature vector consisting of the mel-scaled filter bank energy coefficients and the envelope of uniform-sized sub-segments of the single-period heart sound signal. To classify the heart disorders, we use an ELM with a single hidden layer. In classification experiments with 9 cardiac disorder categories, the proposed method reaches a classification accuracy of 81.6%, the highest among ELM, multi-layer perceptron (MLP), support vector machine (SVM), and hidden Markov model (HMM).

On the Development of a Continuous Speech Recognition System Using Continuous Hidden Markov Model for Korean Language (연속분포 HMM을 이용한 한국어 연속 음성 인식 시스템 개발)

  • Kim, Do-Yeong;Park, Yong-Kyu;Kwon, Oh-Wook;Un, Chong-Kwan;Park, Seong-Hyun
    • The Journal of the Acoustical Society of Korea / v.13 no.1 / pp.24-31 / 1994
  • In this paper, we report on the development of a speaker-independent continuous speech recognition system using continuous hidden Markov models. The continuous hidden Markov model is described by means and covariance matrices and models the speech signal parameters directly, so it has no quantization error. Filter bank coefficients with their 1st- and 2nd-order derivatives are used as feature vectors to represent the dynamic features of the speech signal. We use the segmental K-means algorithm for training and the triphone as the recognition unit to alleviate the performance degradation caused by coarticulation, which is critical in continuous speech recognition. We also use the one-pass search algorithm, which speeds up recognition. Experimental results show that the system attains a recognition accuracy of 83% without grammar and 94% with finite state networks in speaker-independent speech recognition.
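
The idea of scoring feature vectors directly with Gaussian densities, rather than quantizing them first, can be sketched as follows. This is a minimal illustration and not the system described in the abstract: it assumes a single Gaussian with diagonal covariance per state, and the function names and toy parameters are hypothetical.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance (an assumption made for brevity)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    peak = max(values)
    if peak == -math.inf:
        return -math.inf
    return peak + math.log(sum(math.exp(v - peak) for v in values))

def forward_log_likelihood(obs, log_init, log_trans, means, variances):
    """Forward algorithm in the log domain for an HMM with Gaussian emissions."""
    n = len(log_init)
    alpha = [
        log_init[s] + log_gaussian_diag(obs[0], means[s], variances[s])
        for s in range(n)
    ]
    for x in obs[1:]:
        alpha = [
            logsumexp([alpha[r] + log_trans[r][s] for r in range(n)])
            + log_gaussian_diag(x, means[s], variances[s])
            for s in range(n)
        ]
    return logsumexp(alpha)

# Toy two-state left-to-right model over one-dimensional features (hypothetical values).
NEG_INF = -math.inf
log_init = [0.0, NEG_INF]
log_trans = [[math.log(0.5), math.log(0.5)], [NEG_INF, 0.0]]
means = [[0.0], [5.0]]
variances = [[1.0], [1.0]]
print(forward_log_likelihood([[0.1], [0.2], [4.9], [5.1]],
                             log_init, log_trans, means, variances))
```

Because the emission score is computed from the raw feature vector, no codebook lookup is involved, which is the sense in which a continuous-density model avoids quantization error.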

HMM-based Intent Recognition System using 3D Image Reconstruction Data (3차원 영상복원 데이터를 이용한 HMM 기반 의도인식 시스템)

  • Ko, Kwang-Enu;Park, Seung-Min;Kim, Jun-Yeup;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.2 / pp.135-140 / 2012
  • The mirror neuron system in the cerebrum handles imitative learning based on visual information. By observing how neural activation progresses in a specific region of an observer's mirror neuron system, we can infer the intention behind an action, even when part of the action is hidden. The goal of this paper is to apply imitative learning to a 3D vision-based intelligent system. In our previous research, we restored 3D images acquired with a stereo camera using optical flow and an unscented Kalman filter. Here, the 3D input is a continuous image sequence that includes partially hidden regions. We use a hidden Markov model (HMM) to recognize the intention behind an action from the restored hidden regions. Dynamic inference over sequential input data is well suited to tasks such as hand gesture recognition that involve hidden regions. For the proposed intention recognition, we build on the object outline and feature extraction simulated in our previous research: we generate temporally continuous feature vectors from the extracted features and apply them to the HMM to classify hand gestures according to intention patterns. The hand gesture classification results are obtained as posterior probabilities, and they demonstrate the high accuracy of the method.

On the Development of a Large-Vocabulary Continuous Speech Recognition System for the Korean Language (대용량 한국어 연속음성인식 시스템 개발)

  • Choi, In-Jeong;Kwon, Oh-Wook;Park, Jong-Ryeal;Park, Yong-Kyu;Kim, Do-Yeong;Jeong, Ho-Young;Un, Chong-Kwan
    • The Journal of the Acoustical Society of Korea / v.14 no.5 / pp.44-50 / 1995
  • This paper describes a large-vocabulary continuous speech recognition system for the Korean language using continuous hidden Markov models. To improve the performance of the system, we study the selection of speech modeling units, inter-word modeling, the search algorithm, and grammars. We use triphones as the basic speech modeling units; generalized triphones and function word-dependent phones are used to improve the trainability of the speech units and to reduce errors in function words. Silence between words is optionally inserted by using a silence model and a null transition. A word pair grammar and a bigram model based on word classes are used. We also implement a search algorithm that finds the N-best candidate sentences. A postprocessor reorders the N-best sentences using a word triple grammar, selects the most likely sentence as the final recognition result, and finally corrects trivial errors related to postpositions. In recognition tests on a 3,000-word continuous speech database, the system attained 93.1% word recognition accuracy and 73.8% sentence recognition accuracy using the word triple grammar in postprocessing.
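
The N-best reordering step can be illustrated with a short sketch. It assumes, purely for illustration, that the word triple grammar is a set of allowed three-word sequences and that each violation subtracts a fixed penalty from the first-pass score; the abstract does not specify either detail, and all names and values below are hypothetical.

```python
def triple_violations(words, allowed_triples):
    """Count three-word windows that the (assumed) word triple grammar does not allow."""
    return sum(
        1
        for i in range(len(words) - 2)
        if tuple(words[i:i + 3]) not in allowed_triples
    )

def rescore_nbest(nbest, allowed_triples, penalty=10.0):
    """Reorder (words, first_pass_log_score) candidates, best first.

    The fixed per-violation penalty is an illustrative assumption.
    """
    rescored = [
        (score - penalty * triple_violations(words, allowed_triples), words)
        for words, score in nbest
    ]
    rescored.sort(key=lambda item: item[0], reverse=True)
    return [words for _, words in rescored]

# Toy example: the second candidate wins after rescoring.
allowed = {("a", "b", "c")}
candidates = [(["a", "b", "x"], -5.0), (["a", "b", "c"], -6.0)]
print(rescore_nbest(candidates, allowed)[0])
```

Keeping the grammar out of the first pass and applying it only to a short candidate list keeps the search itself cheap, which is the usual motivation for N-best postprocessing.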

Effective Syllable Modeling for Korean Speech Recognition Using Continuous HMM (연속 은닉 마코프 모델을 이용한 한국어 음성 인식을 위한 효율적 음절 모델링)

  • 김봉완;이용주
    • The Journal of the Acoustical Society of Korea / v.22 no.1 / pp.23-27 / 2003
  • Recently, attempts to use the syllable as the recognition unit to enhance performance in continuous speech recognition have been reported. However, syllables are harder to train than phones, and context-dependent modeling across the syllable boundary is difficult because the number of models is much larger for syllables than for phones. In this paper, we propose a method to enhance the trainability of Korean syllables, together with phoneme-context-dependent syllable modeling across the syllable boundary. An experiment in which the proposed method is applied to word recognition shows an average error reduction of 46.23% in comparison with common syllable modeling. The right-phone-dependent syllable model showed a 16.7% error reduction compared with a triphone model.

Speaker Adaptation Using Neural Network in Continuous Speech Recognition (연속 음성에서의 신경회로망을 이용한 화자 적응)

  • 김선일
    • The Journal of the Acoustical Society of Korea / v.19 no.1 / pp.11-15 / 2000
  • This paper describes speaker-adaptive continuous speech recognition for the RM speech corpus. Hidden Markov models for the reference speaker are trained on the training data of the RM corpus, and the evaluation data of the RM corpus are used for evaluation. A portion of other training data from the RM corpus is used for speaker adaptation. After another speaker's data is aligned to the reference data by dynamic time warping, an error back-propagation neural network is used to transform the spectrum between the speaker to be recognized and the reference speaker. Experimental results on tuning the neural network for the best adaptation are described. In the best case, adaptation increases the word recognition rate by a factor of 2.1 and the word accuracy by a factor of 4.7.
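
The alignment step that pairs frames of one speaker with frames of the reference speaker can be sketched with textbook dynamic time warping. This is a generic version with the three standard step directions, not the paper's configuration; the function name and the toy sequences are illustrative assumptions.

```python
import math

def dtw(seq_a, seq_b, dist):
    """Return (total cost, alignment path) for two sequences under a local distance."""
    n, m = len(seq_a), len(seq_b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to recover which frames were paired.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path

# Toy one-dimensional "spectra"; real features would be vectors with a vector distance.
total, pairs = dtw([1, 2, 3], [1, 2, 2, 3], lambda a, b: abs(a - b))
print(total, pairs)
```

The resulting index pairs are the kind of aligned frame pairs that could serve as input and target examples when training a network to map one speaker's spectrum onto another's.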

Gesture Recognition using Training-effect on image sequences (연속 영상에서 학습 효과를 이용한 제스처 인식)

  • 이현주;이칠우
    • Proceedings of the IEEK Conference / 2000.06d / pp.222-225 / 2000
  • Humans frequently communicate non-linguistic information through gestures, so efficient and fast gesture recognition algorithms are needed for more natural human-computer interaction. However, automatic gesture recognition is difficult because the human body is a three-dimensional object with a very complex structure. In this paper, we propose a method that detects key frames and frame changes and classifies image sequences into gesture groups. Gestures are classified according to the moving part of the body. First, we detect frames in which the motion area changes abruptly, save them as key frames, and use them to classify sequences. We then convert each image of a classified sequence into a symbol using principal component analysis (PCA) and a clustering algorithm, since it is better to represent gestures with fewer components. These symbols serve as the input to a hidden Markov model (HMM), and the gesture is recognized by probability calculation.

The Comparison of Speaker Adaptation Methods (화자 적응 방법들의 비교)

  • 황영수
    • The Journal of the Acoustical Society of Korea / v.18 no.1 / pp.61-66 / 1999
  • In this paper, we propose several speaker adaptation methods and study their performance. The methods studied are MAPE (maximum a posteriori probability estimation), linear spectral estimation, multi-layer perceptron, and ARTMAP. To evaluate these methods, we used Korean isolated digits as the experimental data. The hybrid speaker adaptation method, which unifies MAPE, linear spectral estimation, and the output probability of the semi-continuous HMM (SCHMM), showed better recognition results than the other methods. The method using ARTMAP showed results similar to those of the hybrid method.

The Implementation of Automatic Segmentation and Labelling System Using Context-dependent Demi-phone (문맥종속 반음소단의 모델을 이용한 자동 음소분할 및 레이블링 시스템의 구현)

  • 김태환
    • Proceedings of the Acoustical Society of Korea Conference / 1998.06e / pp.351.2-356 / 1998
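  • Databases labeled at the phoneme level are very important in speech research. However, manual phoneme segmentation and labeling require a great deal of time and effort, so automatic phoneme segmentation and labeling systems are being widely studied. In this paper, we implement an automatic phoneme segmentation and labeling system using context-dependent demi-phone unit models, which combine the advantages of monophones and triphones. The labeling units comprise 68 phoneme-like units plus silence, 69 in total, and phonemes are modeled with continuous HMMs. In a comparison between systems using a conventional subword unit model and the proposed context-dependent demi-phone model, 60.17% and 66.32% of phoneme boundaries, respectively, fall within 10 ms of the reference, an improvement of 6.15%, and 90.36% and 94.27% fall within 40 ms, an improvement of 3.92%.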
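
The boundary-error figures above are percentages of automatically placed boundaries that fall within a tolerance of the manual reference. A minimal sketch of that metric follows; it assumes boundaries are already paired one-to-one and given in milliseconds, and the function name and toy values are hypothetical rather than taken from the paper.

```python
def boundary_accuracy(auto_ms, manual_ms, tolerance_ms):
    """Percentage of automatic boundaries within tolerance_ms of the paired manual boundary."""
    if len(auto_ms) != len(manual_ms) or not manual_ms:
        raise ValueError("boundary lists must be non-empty and of equal length")
    hits = sum(
        1 for a, m in zip(auto_ms, manual_ms) if abs(a - m) <= tolerance_ms
    )
    return 100.0 * hits / len(manual_ms)

# Toy boundaries in milliseconds (illustrative only).
manual = [100, 250, 400, 620]
automatic = [104, 265, 398, 670]
print(boundary_accuracy(automatic, manual, 10))
print(boundary_accuracy(automatic, manual, 40))
```

Reporting the metric at both a tight and a loose tolerance, as the abstract does with 10 ms and 40 ms, separates fine placement accuracy from gross segmentation errors.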

A Study on the Variable Vocabulary Speech Recognition in the Vocabulary-Independent Environments (어휘독립 환경에서의 가변어휘 음성인식에 관한 연구)

  • 황병한
    • Proceedings of the Acoustical Society of Korea Conference / 1998.06e / pp.369-372 / 1998
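  • This paper addresses variable-vocabulary speech recognition, in which the recognition vocabulary can be extended or changed without additional training in a vocabulary-independent environment. In variable-vocabulary recognition, phoneme models are first trained on a large speech database (DB); once the target vocabulary is determined, the phoneme models are concatenated according to a pronunciation dictionary, so the vocabulary can be changed or extended without additional training. Recognition experiments were performed using triphones, a context-dependent phoneme model, and a vocabulary-dependent model was built separately for performance comparison. To prevent the loss of reliability in model parameters caused by the unseen triphone problem and insufficient training data, we adopted tree-based clustering (TBC) [1], a state-tying method based on phonetic knowledge. Recognition experiments were run with three kinds of speech feature vectors based on mel-frequency cepstral coefficients (MFCC) and log energy, and we implemented an isolated-word recognition system based on hidden Markov models (HMM) with continuous probability distributions. A database of 22 department names [3] was used in the recognition experiments. The vocabulary-independent system achieved a recognition rate of up to 98.4%, close to the 99.7% obtained in the vocabulary-dependent environment.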
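
Building word models by concatenating phoneme models from a pronunciation dictionary can be sketched as below. The sketch expands each word into word-internal triphone names, which would then index already trained models. The `left-center+right` naming follows a common HTK-style convention and is an assumption here, as are the lexicon entries and function names.

```python
def word_to_triphones(phones):
    """Expand a phone sequence into word-internal triphone names (left-center+right)."""
    names = []
    for i, center in enumerate(phones):
        name = center
        if i > 0:
            name = f"{phones[i - 1]}-{name}"
        if i + 1 < len(phones):
            name = f"{name}+{phones[i + 1]}"
        names.append(name)
    return names

def build_word_models(vocabulary, lexicon):
    """Map each word to the triphone sequence that would be concatenated into its model."""
    return {word: word_to_triphones(lexicon[word]) for word in vocabulary}

# Hypothetical pronunciation dictionary; changing the vocabulary needs no retraining,
# only a new lookup and concatenation.
lexicon = {"cat": ["k", "a", "t"], "at": ["a", "t"]}
print(build_word_models(["cat", "at"], lexicon))
```

A triphone name produced this way may never have occurred in the training data, which is the unseen triphone problem that state tying with tree-based clustering is meant to address.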
