Integrated Search | Korea Science

Korean Speech Recognition using DHMM

안태옥;이강성;유형근;이형준;조형제;변용규;김순협
- 한국음향학회지 / Vol. 10, No. 1 / pp. 52-60 / 1991
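This study concerns isolated-word recognition using a DHMM (Dynamic Hidden Markov Model) that takes the dynamic features of the spectrum as a parameter, and discusses speech recognition experiments based on the DHMM, which can evaluate dynamic spectral features as well as static ones. LPC cepstrum coefficients were used as the static features, and regression coefficients of the LPC cepstrum as the dynamic features. Words were modeled with the DHMM using two VQ codebooks, each built by clustering one of these two kinds of feature vectors, together with the static and dynamic vectors taken as input. Overall, recognition experiments using the conventional HMM achieved a recognition rate of 88.8%, whereas those using the DHMM achieved 92.7%.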
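The dynamic feature described in this abstract, regression coefficients of the cepstrum, can be sketched as follows. This is a minimal illustration using the common linear-regression (delta) formula over a symmetric window. The abstract does not state the window length or how frame edges are handled, so `half_window` and the edge replication are assumptions, and the function names are hypothetical.

```python
def regression_coefficients(frames, half_window=2):
    """First-order regression (delta) coefficients of a cepstral sequence.

    frames: list of equal-length lists, one cepstral vector per frame.
    half_window: frames on each side of the centre (assumed value,
    not taken from the paper).
    """
    if not frames:
        return []
    norm = 2 * sum(k * k for k in range(1, half_window + 1))
    last = len(frames) - 1
    dim = len(frames[0])
    deltas = []
    for t in range(len(frames)):
        delta = [0.0] * dim
        for k in range(1, half_window + 1):
            # Replicate the first/last frame beyond the sequence edges.
            ahead = frames[min(t + k, last)]
            behind = frames[max(t - k, 0)]
            for d in range(dim):
                delta[d] += k * (ahead[d] - behind[d])
        deltas.append([v / norm for v in delta])
    return deltas


def nearest_codeword(vector, codebook):
    """Index of the codebook entry closest to vector (squared Euclidean distance)."""
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(vector, entry))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))
```

Under these assumptions, each frame would yield two discrete symbols: one from quantizing the static vector against the static codebook and one from quantizing its delta vector against the dynamic codebook.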

Activity recognition of stroke-affected people using wearable sensor

Anusha David;Rajavel Ramadoss;Amutha Ramachandran;Shoba Sivapatham
- ETRI Journal / Vol. 45, No. 6 / pp. 1079-1089 / 2023
Stroke is one of the leading causes of long-term disability worldwide, placing huge burdens on individuals and society. Further, automatic human activity recognition is a challenging task that is vital to the future of healthcare and physical therapy. Using a baseline long short-term memory recurrent neural network, this study provides a novel dataset of stretching, upward stretching, flinging motions, hand-to-mouth movements, swiping gestures, and pouring motions for improved model training and testing of stroke-affected patients. A MATLAB application is used to output textual and audible prediction results. A wearable sensor with a triaxial accelerometer is used to collect preprocessed real-time data. The model is trained with features extracted from the actual patient to recognize new actions, and the recognition accuracy provided by multiple datasets is compared based on the same baseline model. When training and testing using the new dataset, the baseline model shows recognition accuracy that is 11% higher than the Activity Daily Living dataset, 22% higher than the Activity Recognition Single Chest-Mounted Accelerometer dataset, and 10% higher than another real-world dataset.
https://doi.org/10.4218/etrij.2022-0242

A Multi-Model Based Noisy Speech Recognition Using the Model Compensation Method

정용주;곽성우
- 대한음성학회지:말소리 / No. 62 / pp. 97-112 / 2007
Speech recognizers generally operate in noisy acoustic environments. Much research has been done to cope with acoustic variation. Among these approaches, the multiple-HMM model approach seems to be quite effective compared with conventional methods. In this paper, we consider a multiple-model approach combined with the model compensation method and investigate the necessary number of HMM model sets through noisy speech recognition experiments. By using data-driven Jacobian adaptation for the model compensation, the multiple-model approach with only a few model sets for each noise type could achieve results comparable to those of the re-training method.

Korean Broadcast News Transcription Using Morpheme-based Recognition Units

Kwon, Oh-Wook;Alex Waibel
- The Journal of the Acoustical Society of Korea / Vol. 21, No. 1E / pp. 3-11 / 2002
Broadcast news transcription is one of the hardest tasks in speech recognition because broadcast speech signals vary widely in speech quality, channel and background conditions. We developed a Korean broadcast news speech recognizer. We used a morpheme-based dictionary and a language model to reduce the out-of-vocabulary (OOV) rate. We concatenated the original morpheme pairs of short length or high frequency in order to reduce insertion and deletion errors due to short morphemes. We used a lexicon with multiple pronunciations to reflect inter-morpheme pronunciation variations without severe modification of the search tree. By using the merged morphemes as recognition units, we achieved an OOV rate of 1.7%, comparable to European languages with a 64k vocabulary. We implemented a hidden Markov model-based recognizer with vocal tract length normalization and online speaker adaptation by maximum likelihood linear regression. Experimental results showed that the recognizer yielded a 21.8% morpheme error rate for anchor speech and 31.6% for mostly noisy reporter speech.
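For readers unfamiliar with the OOV rate quoted in this abstract, the sketch below shows how the metric is conventionally computed: the fraction of running tokens in a test text that are absent from the recognizer's vocabulary. The function name and the toy tokens are hypothetical, and the abstract does not describe the authors' exact counting procedure.

```python
def oov_rate(tokens, vocabulary):
    """Fraction of running tokens that do not appear in the vocabulary."""
    if not tokens:
        return 0.0
    vocab = set(vocabulary)
    missing = sum(1 for token in tokens if token not in vocab)
    return missing / len(tokens)


# Toy example with made-up recognition units (not data from the paper).
example_vocabulary = ["news", "today", "report", "weather"]
example_tokens = ["today", "news", "typhoon", "report"]
example_rate = oov_rate(example_tokens, example_vocabulary)
```

Because the rate is counted over running tokens, the choice of unit matters: the same text segmented into words, morphemes, or merged morphemes gives different token counts and different OOV rates for a fixed vocabulary size.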

Enhanced Independent Component Analysis of Temporal Human Expressions Using Hidden Markov model

이지준;;김태성
- 한국HCI학회:학술대회논문집 / 한국HCI학회 2008 Conference, Part 1 / pp. 487-492 / 2008
Facial expression recognition is an intensive research area for designing Human Computer Interfaces. In this work, we present a new facial expression recognition system utilizing Enhanced Independent Component Analysis (EICA) for feature extraction and discrete Hidden Markov Model (HMM) for recognition. Our proposed approach for the first time deals with sequential images of emotion-specific facial data analyzed with EICA and recognized with HMM. Performance of our proposed system has been compared to the conventional approaches where Principal and Independent Component Analysis are utilized for feature extraction. Our preliminary results show that our proposed algorithm produces improved recognition rates in comparison to previous works.

Recognition of Human Facial Expression in a Video Image using the Active Appearance Model

Jo, Gyeong-Sic;Kim, Yong-Guk
- Journal of Information Processing Systems / Vol. 6, No. 2 / pp. 261-268 / 2010
Tracking human facial expression within a video image has many useful applications, such as surveillance and teleconferencing, etc. Initially, the Active Appearance Model (AAM) was proposed for facial recognition; however, it turns out that the AAM has many advantages as regards continuous facial expression recognition. We have implemented a continuous facial expression recognition system using the AAM. In this study, we adopt an independent AAM using the Inverse Compositional Image Alignment method. The system was evaluated using the standard Cohn-Kanade facial expression database, the results of which show that it could have numerous potential applications.
https://doi.org/10.3745/JIPS.2010.6.2.261

Facial Expression Recognition using 1D Transform Features and Hidden Markov Model

Jalal, Ahmad;Kamal, Shaharyar;Kim, Daijin
- Journal of Electrical Engineering and Technology / Vol. 12, No. 4 / pp. 1657-1662 / 2017
Facial expression recognition systems using video devices have emerged as an important component of natural human-machine interfaces, contributing to various practical applications such as security systems, behavioral science and clinical practice. In this work, we present a new method to analyze, represent and recognize human facial expressions using a sequence of facial images. Under our proposed facial expression recognition framework, the overall procedure includes accurate face detection to remove background and noise effects from the raw image sequences and alignment of each image using vertex mask generation. Furthermore, these features are reduced by principal component analysis. Finally, these augmented features are trained and tested using a Hidden Markov Model (HMM). In the experimental evaluation on two public datasets of facial expression videos, Cohn-Kanade and AT&T, the proposed approach achieved expression recognition rates of 96.75% and 96.92%. In addition, the recognition results show the superiority of the proposed approach over state-of-the-art methods.
https://doi.org/10.5370/JEET.2017.12.4.1657

Pattern Recognition of Rotor Fault Signal Using Hidden Markov Model

이종민;김승종;황요하;송창섭
- 대한기계학회논문집A / Vol. 27, No. 11 / pp. 1864-1872 / 2003
The Hidden Markov Model (HMM) has been widely used in speech recognition; however, its use in machine condition monitoring has been very limited despite its good potential. In this paper, an HMM is used to recognize rotor fault patterns. First, we set up a rotor kit under unbalance and oil whirl conditions. Time signals of the two failure conditions were sampled and converted to auto power spectra. Using a filter bank, feature vectors were calculated from these auto power spectra. Next, a continuous HMM and a discrete HMM were trained with scaled forward/backward variables and a diagonal covariance matrix. Finally, each HMM was applied to all sampled data to test its fault recognition ability. It was found that the HMM has good recognition ability in rotor fault pattern recognition despite the small number of training data sets.
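The scaled forward variables mentioned in this abstract can be illustrated for the discrete-HMM case. The sketch below is the textbook scaled forward recursion, not the authors' implementation. It normalizes the forward variables at every frame and accumulates the logarithms of the scale factors, which avoids numerical underflow on long observation sequences. All names and the example parameters are hypothetical.

```python
import math


def scaled_forward_log_likelihood(initial, transition, emission, observations):
    """Log-likelihood of a symbol sequence under a discrete HMM.

    initial[i]: probability of starting in state i
    transition[i][j]: probability of moving from state i to state j
    emission[i][k]: probability of state i emitting symbol k
    observations: list of symbol indices
    """
    if not observations:
        return 0.0
    n_states = len(initial)
    log_likelihood = 0.0
    alpha = [initial[i] * emission[i][observations[0]] for i in range(n_states)]
    for t in range(len(observations)):
        if t > 0:
            # Induction step: propagate the scaled alphas one frame forward.
            alpha = [
                sum(alpha[i] * transition[i][j] for i in range(n_states))
                * emission[j][observations[t]]
                for j in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under this model
        alpha = [a / scale for a in alpha]
        log_likelihood += math.log(scale)
    return log_likelihood


# Hypothetical two-state, two-symbol model for illustration only.
example_initial = [0.6, 0.4]
example_transition = [[0.7, 0.3], [0.4, 0.6]]
example_emission = [[0.9, 0.1], [0.2, 0.8]]
example_ll = scaled_forward_log_likelihood(
    example_initial, example_transition, example_emission, [0, 1]
)
```

In a classification setting such as the one described, a model would typically be trained per fault condition, and a new observation sequence assigned to the model with the highest log-likelihood.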
https://doi.org/10.3795/KSME-A.2003.27.11.1864

A Study on Character Recognition using HMM and the Mason's Theorem

Lee Sang-kyu;Hur Jung-youn
- 대한전자공학회:학술대회논문집 / 대한전자공학회 2004 ICEIC The International Conference on Electronics Informations and Communications / pp. 259-262 / 2004
In most character recognition systems, template matching or a statistical method using a hidden Markov model is used to extract and recognize feature shapes. In this paper, we used a modified chain-code which has 8 directions but 4 codes, made the chain-code of a hand-written character, and then converted it into a transition chain-code by applying an HMM (Hidden Markov Model). The transition chain-code produced by the HMM is analyzed as a signal flow graph by Mason's theory, which is generally used to calculate forward gain in automatic control systems. If the specific forward gain and feedback gain are properly set, the forward gain of the transition chain-code obtained using Mason's theory can be distinguished for each object to be recognized. This gain data is reorganized as a tree structure, making it possible to distinguish different hand-written characters. With this method, a 91% recognition rate was achieved.

DNN-based acoustic modeling for speech recognition of native and foreign speakers

강병옥;권오욱
- 말소리와 음성과학 / Vol. 9, No. 2 / pp. 95-101 / 2017
This paper proposes a new method to train Deep Neural Network (DNN)-based acoustic models for speech recognition of native and foreign speakers. The proposed method consists of determining multi-set state clusters with various acoustic properties, training a DNN-based acoustic model, and recognizing speech based on the model. In the proposed method, hidden nodes of DNN are shared, but output nodes are separated to accommodate different acoustic properties for native and foreign speech. In an English speech recognition task for speakers of Korean and English respectively, the proposed method is shown to slightly improve recognition accuracy compared to the conventional multi-condition training method.
https://doi.org/10.13064/KSSS.2017.9.2.095

Search results: 3,431 items; processing time: 0.032 s
