• Title/Summary/Keyword: HMM(HMM)

Gaussian Selection in HMM Speech Recognizer with PTM Model for Efficient Decoding (PTM 모델을 사용한 HMM 음성인식기에서 효율적인 디코딩을 위한 가우시안 선택기법)

  • 손종목;정성윤;배건성
    • The Journal of the Acoustical Society of Korea / v.23 no.1 / pp.75-81 / 2004
  • Gaussian selection (GS) is a popular approach to fast decoding in continuous-density hidden Markov models. It speeds up likelihood computation by reducing the number of Gaussian components that must be evaluated. In this paper, we propose a new GS method for phonetic tied-mixture (PTM) hidden Markov models. The PTM model represents each state at the same topological location with a shared set of Gaussian mixture components and context-dependent weights. The proposed method therefore imposes a constraint on the weights as well as on the number of Gaussian components to reduce the computational load. Experimental results show that the proposed method reduces the percentage of Gaussian computation to 16.41%, compared with 20-30% for conventional GS methods, with little degradation in recognition performance.
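
The general idea behind Gaussian selection with tied mixtures can be pictured with the minimal sketch below. It is not the paper's algorithm: the shortlist of candidate components, the `min_weight` threshold, and all function names are assumptions made for illustration. A state's likelihood is approximated by summing only over components that are both shortlisted and carry a non-negligible state-specific weight.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def select_gaussians(weights, shortlist, min_weight):
    """Keep shortlisted components whose state weight is positive and >= min_weight.

    The threshold is a hypothetical stand-in for a constraint on the weights.
    """
    return [k for k in shortlist if weights[k] > 0.0 and weights[k] >= min_weight]

def state_log_likelihood(x, means, variances, weights, selected, floor=-1e10):
    """Log of the weighted mixture sum, restricted to the selected components."""
    terms = [
        math.log(weights[k]) + log_gauss_diag(x, means[k], variances[k])
        for k in selected
    ]
    if not terms:
        return floor  # nothing selected: back off to a floor value
    top = max(terms)
    # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))
```

In this sketch the shared `means` and `variances` play the role of the tied codebook, while `weights` belongs to one state; skipping low-weight components saves density evaluations at the cost of a small approximation error.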

Korean Word Recognition using the Transition Matrix of VQ-Code and DHMM (VQ코드의 천이 행렬과 이산 HMM을 이용한 한국어 단어인식)

  • Chung, Kwang-Woo;Hong, Kwang-Seok;Park, Byung-Chul
    • The Journal of the Acoustical Society of Korea / v.13 no.4 / pp.40-49 / 1994
  • In this paper, we propose two methods for improving the performance of a word recognition system. The key strategy of the first method is to apply inertia to the feature vector sequences of the speech signal to stabilize the transitions between VQ codes. The second method generates new observation probabilities by using the transition matrix of VQ codes as weights on the observation probability of the output symbol, so as to take into account the temporal relation between neighboring frames in the DHMM. By applying inertia to the feature vector sequences, we can reduce the overlap of the probability distributions of the response paths for each word and stabilize state transitions in the HMM. By using the transition matrix of VQ codes as weights in the conventional DHMM, we can partition the probability distribution of feature vectors more finely and restrict the feature distribution to a suitable region, so that the performance of the recognition system improves. To evaluate the proposed methods, we carried out experiments on 50 DDD area names. The proposed methods improved the recognition rate by $4.2\%$ in the speaker-dependent test and $12.45\%$ in the speaker-independent test, compared with the conventional DHMM.

HMM-based Speech Recognition using DMS Model and Fuzzy Concept (DMS 모델과 퍼지 개념을 이용한 HMM에 기초를 둔 음성 인식)

  • Ann, Tae-Ock
    • Journal of the Korea Academia-Industrial cooperation Society / v.9 no.4 / pp.964-969 / 2008
  • This paper proposes an HMM-based recognition method for speaker-independent speech recognition that uses a DMSVQ (Dynamic Multi-Section Vector Quantization) codebook built from the DMS (Dynamic Multi-Section) model together with a fuzzy concept. In the proposed method, training data are divided into several dynamic sections, and for each section multi-observation sequences are obtained, with probabilities assigned by a fuzzy rule according to the order of distance from the DMSVQ codebook, nearest first. An HMM is then trained on these multi-observation sequences, and at recognition time the word with the highest probability is selected as the recognized word. For comparison, recognition experiments with various conventional methods were carried out on the same data under equivalent conditions. The experimental results show that the proposed method is superior to the conventional recognition methods.

A Recognition Algorithm of Suspicious Human Behaviors using Hidden Markov Models in an Intelligent Surveillance System (지능형 영상 감시 시스템에서의 은닉 마르코프 모델을 이용한 특이 행동 인식 알고리즘)

  • Jung, Chang-Wook;Kang, Dong-Joong
    • Journal of Korea Multimedia Society / v.11 no.11 / pp.1491-1500 / 2008
  • This paper proposes an intelligent surveillance system that recognizes suspicious patterns of human behavior using the Hidden Markov Model. First, the method finds the foot area of a person with a motion detection algorithm applied to the image sequence from the surveillance camera. The foot loci then form the observation series of features used to train the HMM. The feature, which is the position of the person's foot, is converted to a code corresponding to a specific label among 16 local partitions of the image region. Specific movement patterns formed by the foot loci are therefore series of label numbers. The Baum-Welch algorithm of the HMM learns each suspicious and specific pattern in order to classify human behaviors. To recognize an input behavior pattern in a test image, a probabilistic comparison between the patterns learned by the HMM and the foot series under test decides the category of the test pattern. The experimental results show that the method can be applied to detect a suspicious person prowling in a corridor.
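
The labeling and comparison steps described in this abstract can be sketched roughly as follows. The 4 x 4 layout of the 16 partitions, the function names, and the model tuple format are assumptions for illustration; the abstract does not specify them. Training with Baum-Welch is omitted, so the sketch only shows how a foot position becomes a label and how a label sequence is scored against already-trained discrete HMMs with the scaled forward algorithm.

```python
import math

def grid_label(x, y, width, height, rows=4, cols=4):
    """Map a pixel position to a cell index in a rows x cols grid.

    The 4 x 4 default is an assumed layout for the 16 partitions.
    """
    col = min(int(x * cols / width), cols - 1)
    row = min(int(y * rows / height), rows - 1)
    return row * cols + col

def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # propagate through the transitions, then apply the emission
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence impossible under this model
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob

def classify(obs, models):
    """Return the name of the model with the highest log-likelihood.

    models maps a name to a (start, trans, emit) tuple.
    """
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))
```

A test sequence is then assigned to whichever behavior model scores it highest, which is one plausible reading of the "probabilistic comparison" mentioned above.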

LSTM RNN-based Korean Speech Recognition System Using CTC (CTC를 이용한 LSTM RNN 기반 한국어 음성인식 시스템)

  • Lee, Donghyun;Lim, Minkyu;Park, Hosung;Kim, Ji-Hwan
    • Journal of Digital Contents Society / v.18 no.1 / pp.93-99 / 2017
  • A hybrid approach using a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) has shown great improvement in speech recognition accuracy. Training an acoustic model with the hybrid approach requires forced alignment of the HMM state sequence from a Gaussian Mixture Model (GMM)-Hidden Markov Model (HMM), and training the GMM-HMM requires high computation time. This paper proposes an end-to-end approach to LSTM RNN-based Korean speech recognition to improve learning speed. A Connectionist Temporal Classification (CTC) algorithm is adopted to implement this approach. The proposed method showed an almost equal recognition rate, while learning was 1.27 times faster.
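
CTC removes the need for a frame-level forced alignment because many frame-level label paths map to the same output sequence. The small sketch below shows only that standard mapping (merge consecutive repeats, then drop blanks); the function name and the choice of `0` as the blank index are assumptions, and the sketch says nothing about the paper's network or training setup.

```python
from itertools import groupby

def ctc_collapse(path, blank=0):
    """Collapse a frame-level label path into an output label sequence.

    Consecutive repeated labels are merged first, then blanks are removed,
    so a blank between two identical labels keeps them as separate outputs.
    """
    return [label for label, _ in groupby(path) if label != blank]
```

For instance, the paths `[1, 1, 0, 2]` and `[0, 1, 2, 2]` both collapse to the same label sequence, which is why no single alignment has to be fixed in advance.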

Video Based Fall Detection Algorithm Using Hidden Markov Model (은닉 마르코프 모델을 이용한 동영상 기반 낙상 인식 알고리듬)

  • Kim, Nam Ho;Yu, Yun Seop
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.8 / pp.232-237 / 2013
  • A newly developed fall detection algorithm using an HMM (Hidden Markov Model) on features extracted from video is introduced. An HMM is used to distinguish falls, whose patterns differ from person to person, from normal activities of daily living (ADL). To obtain the fall feature vector from the video, PCA (Principal Component Analysis) is applied to the motion vectors from optical flow. The angle, the ratio of the long to the short axis, and the velocity obtained from the PCA results are combined to form new fall feature parameters. These parameters were applied to the HMM and the results were compared and analyzed. Among the newly proposed fall parameters, the angle of movement showed the best results. The results show that this parameter can distinguish various types of falls from ADLs with 91.5% sensitivity and 88.01% specificity.
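
As a rough illustration of how an angle and a long-to-short axis ratio can come out of PCA on 2-D motion vectors, consider the sketch below. The abstract does not give the exact feature definitions, so the formulas and function names here are assumptions: the angle is taken as the orientation of the first principal axis, and the axis ratio as the square root of the eigenvalue ratio of the 2 x 2 covariance matrix.

```python
import math

def pca_2d(vectors):
    """Closed-form PCA of 2-D vectors.

    Returns (angle, lam1, lam2): the orientation of the first principal axis
    in radians and the two covariance eigenvalues with lam1 >= lam2 >= 0.
    """
    n = len(vectors)
    mx = sum(v[0] for v in vectors) / n
    my = sum(v[1] for v in vectors) / n
    sxx = sum((v[0] - mx) ** 2 for v in vectors) / n
    syy = sum((v[1] - my) ** 2 for v in vectors) / n
    sxy = sum((v[0] - mx) * (v[1] - my) for v in vectors) / n
    half_trace = (sxx + syy) / 2.0
    root = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    lam1 = half_trace + root
    lam2 = max(half_trace - root, 0.0)  # clamp tiny negative rounding errors
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return angle, lam1, lam2

def axis_ratio(lam1, lam2):
    """Ratio of long to short axis lengths (square roots of the eigenvalues)."""
    if lam2 == 0.0:
        return float("inf")  # all vectors lie on one line
    return math.sqrt(lam1 / lam2)
```

Per-frame values of such features could then be quantized or modeled directly as the observation sequence for an HMM; how the paper does this is not stated in the abstract.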

HMM-based Speech Recognition using FSVQ and Fuzzy Concept (FSVQ와 퍼지 개념을 이용한 HMM에 기초를 둔 음성 인식)

  • 안태옥
    • Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.6 / pp.90-97 / 2003
  • This paper proposes a speech recognition method based on HMM (Hidden Markov Model) using FSVQ (First Section Vector Quantization) and a fuzzy concept. In the proposed method, we generate a codebook for the first section and then obtain multi-observation sequences from that codebook in descending order of probabilistic value based on a fuzzy rule. These first-section observation sequences are then used for training, and at recognition time the word with the highest first-section probability is selected as the recognized word by the same concept. Train station names are selected as the target recognition vocabulary and LPC cepstrum coefficients are used as the feature parameters. In addition to the speech recognition experiments with the proposed method, we run experiments with other methods under the same conditions and data. The experimental results show that the proposed HMM-based method using FSVQ and the fuzzy concept is superior to the others in recognition rate.

Korean Word Recognition Using Diphone-Level Hidden Markov Model (Diphone 단위의 hidden Markov model을 이용한 한국어 단어 인식)

  • Park, Hyun-Sang;Un, Chong-Kwan;Park, Yong-Kyu;Kwon, Oh-Wook
    • The Journal of the Acoustical Society of Korea / v.13 no.1 / pp.14-23 / 1994
  • In this paper, speech units appropriate for recognition of the Korean language have been studied. For better speech recognition, co-articulatory effects within an utterance should be considered in the selection of a recognition unit. One way to model such effects is to use larger units of speech. It has been found that the diphone is a good recognition unit because it can model transitional regions explicitly. When diphones are used, stationary phoneme models may be inserted between diphones. Computer simulation for isolated word recognition was done with a 7 word database spoken by seven male speakers. Best performance was obtained when transition regions between phonemes were modeled by two-state HMMs and stationary phoneme regions by one-state HMMs, excluding /b/, /d/, and /g/. By merging rarely occurring diphone units, the recognition rate was increased from $93.98\%$ to $96.29\%$. In addition, a local interpolation technique was used to smooth a poorly modeled HMM with a well-trained HMM. With this technique we obtained a recognition rate of $97.22\%$ after merging some diphone units.

The Chinese Characters Learning Contents Based on Gesture Recognition Using HMM Algorithm (HMM을 이용한 제스처 인식 기반 한자 학습 콘텐츠)

  • Song, Dae-Hyeon;Kim, Dong-Min;Lee, Chil-Woo
    • Journal of Korea Multimedia Society / v.15 no.8 / pp.1067-1074 / 2012
  • In this paper, we propose Chinese character learning content based on gesture recognition using the HMM (hidden Markov model) algorithm. The system's input is 3-dimensional information obtained from a TOF camera, and the gesture recognition method consists of one part that estimates the user's posture from two infrared images and another that recognizes gestures from continuous poses. For human-computer communication, the system offers the convenience that the user can operate it easily with body actions alone, without any additional equipment. Because the system raises immersion and interest by using two large displays and various multimedia elements, it can maximize information transmission. The edutainment Chinese character content proposed in this paper has the educational effect that users can master Chinese characters naturally and with interest, and a synergy effect can be expected from experiencing the content because it is based on gesture recognition.

Gesture Recognition Using Stereo Tracking Initiator and HMM for Tele-Operation (스테레오 영상 추적 자동초기화와 HMM을 이용한 원격 작업용 제스처 인식)

  • Jeong, Ji-Won;Lee, Yong-Beom;Jin, Seong-Il
    • The Transactions of the Korea Information Processing Society / v.6 no.8 / pp.2262-2270 / 1999
  • In this paper, we describe a gesture recognition algorithm using a computer vision sensor and HMM. Automatic hand region extraction is proposed for initializing the tracking of tele-operation gestures. For this, distance information (a disparity map) obtained from stereo matching of the initial left and right images is employed to isolate the hand region from the scene. The PDOE (positive difference of edges) feature images adopted here have been found to be robust against noise and background brightness. The KNU/KAERI (K/K) gesture instruction set is defined for tele-operation in nuclear power plants. A composite recognition model, constructed by concatenating three gesture instruction models (pre-orders, basic orders, and post-orders), is proposed and identified by discrete HMM. Our experimental results showed that consecutive orders composed of more than two orders are correctly recognized at a rate above 97%.
