Search | Korea Science

A Study on the Redundancy Reduction in Speech Recognition (음성인식에서 중복성의 저감에 대한 연구)

Lee, Chang-Young
- The Journal of the Korea institute of electronic communication sciences / v.7 no.3 / pp.475-483 / 2012
The characteristic features of a speech signal do not vary significantly from frame to frame, so it is advisable to reduce the redundancy among similar feature vectors. The objective of this paper is to find the optimal condition of minimum redundancy and maximum relevance of the speech feature vectors in speech recognition. For this purpose, we implement redundancy reduction by way of a vigilance parameter and investigate its effect on speaker-independent recognition of isolated words using FVQ/HMM. Experimental results showed that the number of feature vectors could be reduced by 30% without degrading speech recognition accuracy.
https://doi.org/10.13067/JKIECS.2012.7.3.475
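
The abstract above does not say how the vigilance parameter is applied, so the following is only a minimal sketch of one plausible reading: a frame is kept when its Euclidean distance from the last retained frame reaches a threshold. The function name `reduce_redundancy`, the distance measure, and the sample data are assumptions for illustration, not the paper's method.

```python
import math


def reduce_redundancy(frames, vigilance):
    """Keep a frame only if it differs enough from the last kept frame.

    frames: list of equal-length feature vectors (lists of floats).
    vigilance: assumed distance threshold; larger values drop more frames.
    """
    kept = []
    for frame in frames:
        # Always keep the first frame; afterwards compare with the last kept one.
        if not kept or math.dist(frame, kept[-1]) >= vigilance:
            kept.append(frame)
    return kept


# Made-up two-dimensional feature vectors, purely for illustration.
frames = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.05, 1.0], [3.0, 0.0]]
reduced = reduce_redundancy(frames, vigilance=0.5)
```

With these sample values, the two near-duplicate frames are dropped and three frames remain. Setting `vigilance` to zero keeps every frame.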

A Study on Road Detection Based on MRF in SAR Image (SAR 영상에서 MRF 기반 도로 검출에 관한 연구)

김순백;김두영
- Journal of the Institute of Convergence Signal Processing / v.2 no.2 / pp.7-12 / 2001
In this paper, a hybrid feature estimation method is proposed to detect linear features such as road networks in SAR (synthetic aperture radar) images that contain speckle noise. First, we consider the mean intensity ratio and the statistical properties of locally neighboring regions to detect the linear features of roads. The responses of both methods are combined to detect the entire road network. The purpose of this paper is to extract road segments from the locally detected, fused images and to connect them to one another according to roads of identical intensity. The proposed algorithm defines a Markov random field (MRF) model of the a priori knowledge about roads, applies it to the energy function of interacting density points, and detects the road network by optimizing that energy function.

Feature Extraction Method of 2D-DCT for Facial Expression Recognition (얼굴 표정인식을 위한 2D-DCT 특징추출 방법)

Kim, Dong-Ju;Lee, Sang-Heon;Sohn, Myoung-Kyu
- KIPS Transactions on Software and Data Engineering / v.3 no.3 / pp.135-138 / 2014
This paper devises a facial expression recognition method that is robust to overfitting, using 2D-DCT and the EHMM algorithm. In particular, it achieves enhanced recognition performance by setting a large window size for 2D-DCT feature extraction when extracting the observation vectors of the EHMM. Experimental results on the CK and JAFFE facial expression databases showed that recognition accuracy improved as the window size increased. The proposed method also achieved a recognition accuracy of 87.79% and showed enhanced recognition performance ranging from 46.01% to 50.05% in comparison to previous approaches based on histogram features, when the CK database was used for training and the JAFFE database was used to test recognition accuracy.
https://doi.org/10.3745/KTSDE.2014.3.3.135
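
As a rough illustration of 2D-DCT feature extraction over an image window, the sketch below computes an orthonormal 2D DCT-II in plain Python and keeps the low-frequency coefficients as an observation vector. The window size, the choice of a top-left square of coefficients, and the names `dct_2d` and `low_frequency_features` are assumptions; the abstract does not specify them.

```python
import math


def dct_2d(block):
    """Orthonormal 2D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def scale(k):
        # Normalisation factor that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    def basis(k, i):
        # Cosine basis function of frequency k evaluated at sample i.
        return math.cos(math.pi * (2 * i + 1) * k / (2 * n))

    return [
        [
            scale(u) * scale(v) * sum(
                block[x][y] * basis(u, x) * basis(v, y)
                for x in range(n)
                for y in range(n)
            )
            for v in range(n)
        ]
        for u in range(n)
    ]


def low_frequency_features(block, keep):
    """Flatten the top-left keep x keep coefficients into one vector."""
    coeffs = dct_2d(block)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]


# A constant 4x4 window: all energy should land in the DC coefficient.
window = [[10.0] * 4 for _ in range(4)]
features = low_frequency_features(window, keep=2)
```

For a constant window, only the first (DC) coefficient is non-zero, which is a convenient sanity check. In practice a library routine such as a fast DCT would replace this direct summation.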

Automated epileptic seizure waveform detection method based on the feature of the mean slope of wavelet coefficient counts using a hidden Markov model and EEG signals

Lee, Miran;Ryu, Jaehwan;Kim, Deok-Hwan
- ETRI Journal / v.42 no.2 / pp.217-229 / 2020
Long-term electroencephalography (EEG) monitoring is time-consuming, and requires experts to interpret EEG signals to detect seizures in patients. In this paper, we propose a novel automated method called adaptive slope of wavelet coefficient counts over various thresholds (ASCOT) to classify patient episodes as seizure waveforms. ASCOT involves extracting the feature matrix by calculating the mean slope of wavelet coefficient counts over various thresholds in each frequency subband. We validated our method using our own database and a public database to avoid overtuning. The experimental results show that the proposed method achieved a reliable and promising accuracy in both our own database (98.93%) and the public database (99.78%). Finally, we evaluated the performance of the method considering various window sizes. In conclusion, the proposed method achieved a reliable seizure detection performance with a short-term window size. Therefore, our method can be utilized to interpret long-term EEG results and detect momentary seizure waveforms in diagnostic systems.
https://doi.org/10.4218/etrij.2018-0118
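
The core feature named in the abstract, the mean slope of wavelet coefficient counts over various thresholds, can be sketched as follows for a single subband. This is a simplified reading and not the published ASCOT definition: the wavelet decomposition is omitted, the coefficients and thresholds are made up, and the function names are hypothetical.

```python
def coefficient_counts(coeffs, thresholds):
    """Count coefficients whose magnitude exceeds each threshold."""
    return [sum(1 for c in coeffs if abs(c) > t) for t in thresholds]


def mean_slope(counts, thresholds):
    """Average slope of the count curve between consecutive thresholds."""
    slopes = [
        (counts[i + 1] - counts[i]) / (thresholds[i + 1] - thresholds[i])
        for i in range(len(counts) - 1)
    ]
    return sum(slopes) / len(slopes)


# Made-up coefficients standing in for one frequency subband.
subband = [0.2, -1.5, 3.1, -0.7, 2.4, 0.1]
thresholds = [0.0, 1.0, 2.0, 3.0]
counts = coefficient_counts(subband, thresholds)
feature = mean_slope(counts, thresholds)
```

Repeating this for each subband would give one value per subband, which could then be stacked into a feature matrix of the kind the abstract mentions.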

Emotion recognition in speech using hidden Markov model (은닉 마르코프 모델을 이용한 음성에서의 감정인식)

김성일;정현열
- Journal of the Institute of Convergence Signal Processing / v.3 no.3 / pp.21-26 / 2002
This paper presents a new approach to identifying human emotional states such as anger, happiness, normal, sadness, or surprise. This is accomplished by using discrete duration continuous hidden Markov models (DDCHMM). To this end, the emotional feature parameters are first defined from the input speech signals. In this study, we used prosodic parameters such as pitch, energy, and their derivatives, which were then used to train HMMs for recognition. Speaker-adapted emotional models based on maximum a posteriori (MAP) estimation were also considered for speaker adaptation. The simulation results showed that the recognition rate of vocal emotion gradually increased as the number of adaptation samples increased.

Adaptive Korean Continuous Speech Recognizer to Speech Rate (발화속도 적응적인 한국어 연속음 인식기)

Kim, Jae-Beom;Park, Chan-Kyu;Han, Mi-Sung;Lee, Jung-Hyun
- The Transactions of the Korea Information Processing Society / v.4 no.6 / pp.1531-1540 / 1997
In this paper, we present an automatic Korean continuous speech recognizer that is improved by speech rate estimation and compensation methods. Automatic continuous speech recognition is significantly more difficult than isolated word recognition because of coarticulatory effects and variations in speech rate. To recognize continuous speech, methods for modeling coarticulatory effects and variations in speech rate are needed. In this paper, the speech rate is measured by the change of formant, and the compensation is performed by extracting relatively many feature vectors in fast speech. Coarticulatory effects are modeled by defining a set of 514 Korean diphones, and ETRI's 445-word DB is used as the training speech material. By combining the above methods, we implement an automatic Korean continuous speech recognizer based on DHMM (Discrete Hidden Markov Model), which shows an improved recognition rate.

Image Segmentation Based on Fusion of Range and Intensity Images (거리영상과 밝기영상의 fusion을 이용한 영상분할)

Chang, In-Su;Park, Rae-Hong
- Journal of the Korean Institute of Telematics and Electronics S / v.35S no.9 / pp.95-103 / 1998
This paper proposes an image segmentation algorithm based on fusion of range and intensity images. Based on Bayesian theory, a priori knowledge is encoded by the Markov random field (MRF). A maximum a posteriori (MAP) estimator is constructed using the features extracted from range and intensity images. Objects are approximated by local planar surfaces in range images, and the parametric space is constructed with the surface parameters estimated pixelwise. In intensity images, the $\alpha$-trimmed variance constitutes the intensity feature. An image is segmented by optimizing the MAP estimator, which is constructed using a likelihood function based on edge information. Computer simulation results show that the proposed fusion algorithm effectively segments the images independently of shadow, noise, and light-blurring.
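
The $\alpha$-trimmed variance mentioned above is a robust spread measure. The sketch below uses a common definition (sort the samples, discard a fraction `alpha` from each end, take the variance of the rest); the paper's exact formulation and window shape are not given in the abstract, so the function name and sample values are assumptions.

```python
def alpha_trimmed_variance(values, alpha):
    """Variance of values after discarding the alpha fraction at each end.

    alpha is in [0, 0.5); alpha = 0 gives the ordinary population variance.
    """
    ordered = sorted(values)
    trim = int(alpha * len(ordered))
    kept = ordered[trim:len(ordered) - trim]
    mean = sum(kept) / len(kept)
    return sum((v - mean) ** 2 for v in kept) / len(kept)


# Made-up intensity window with two outliers (0 and 250).
window = [10, 12, 11, 13, 12, 250, 11, 0, 12, 11]
plain = alpha_trimmed_variance(window, alpha=0.0)
trimmed = alpha_trimmed_variance(window, alpha=0.1)
```

Trimming one sample from each end removes both outliers here, so `trimmed` is far smaller than `plain`. That insensitivity to isolated extreme pixels is why such a statistic can serve as an intensity feature in noisy images.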

Emotion Recognition Based on Human Gesture (인간의 제스쳐에 의한 감정 인식)

Song, Min-Kook;Park, Jin-Bae;Joo, Young-Hoon
- Journal of the Korean Institute of Intelligent Systems / v.17 no.1 / pp.46-51 / 2007
This paper presents gesture analysis for human-robot interaction. Understanding human emotions through gesture is one of the skills computers need in order to interact intelligently with their human counterparts. Gesture analysis consists of several processes, such as detecting the hand, extracting features, and recognizing emotions. For efficient operation, we recognized gestures with an HMM (Hidden Markov Model). We constructed a large gesture database, with which we verified our method. As a result, our method was successfully integrated into and operated on a mobile system.
https://doi.org/10.5391/JKIIS.2007.17.1.046

Improvement and Evaluation of the Korean Large Vocabulary Continuous Speech Recognition Platform (ECHOS) (한국어 음성인식 플랫폼(ECHOS)의 개선 및 평가)

Kwon, Suk-Bong;Yun, Sung-Rack;Jang, Gyu-Cheol;Kim, Yong-Rae;Kim, Bong-Wan;Kim, Hoi-Rin;Yoo, Chang-Dong;Lee, Yong-Ju;Kwon, Oh-Wook
- MALSORI / no.59 / pp.53-68 / 2006
We report the evaluation results of the Korean speech recognition platform called ECHOS. The platform has an object-oriented and reusable architecture so that researchers can easily evaluate their own algorithms. The platform has all the intrinsic modules needed to build a large vocabulary speech recognizer: noise reduction, end-point detection, feature extraction, hidden Markov model (HMM)-based acoustic modeling, cross-word modeling, n-gram language modeling, n-best search, word graph generation, and Korean-specific language processing. The platform supports both lexical search trees and finite-state networks. It performs word-dependent n-best search with a bigram in the forward search stage, and rescores the lattice with a trigram in the backward stage. In an 8000-word continuous speech recognition task, the platform with a lexical tree increases word errors by 40% but reduces recognition time by 50% compared to the HTK platform with a flat lexicon. ECHOS reduces recognition errors by 40% through the incorporation of cross-word modeling. With the number of Gaussian mixtures increased to 16, it yields word accuracy comparable to the previous lexical tree-based platform, Julius.

A Local Feature-Based Robust Approach for Facial Expression Recognition from Depth Video

Uddin, Md. Zia;Kim, Jaehyoun
- KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.3 / pp.1390-1403 / 2016
Facial expression recognition (FER) plays a significant role in computer vision, pattern recognition, and image processing applications such as human-computer interaction, because it provides information about people's emotions. For video-based facial expression recognition, depth cameras can be better candidates than RGB cameras: a person's face cannot easily be recognized from distance-based depth videos, so depth cameras also resolve some of the privacy issues that can arise with RGB faces. A good FER system relies heavily on the extraction of robust features as well as on the recognition engine. In this work, an efficient novel approach is proposed to recognize facial expressions from time-sequential depth videos. First, Local Binary Pattern (LBP) features are obtained from the time-sequential depth faces and further processed by Generalized Discriminant Analysis (GDA) to make the features more robust. The LBP-GDA features are then fed into Hidden Markov Models (HMMs) to train and recognize different facial expressions. The proposed depth-based approach is compared with conventional approaches such as Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Linear Discriminant Analysis (LDA), and it outperforms them by obtaining better recognition rates.
https://doi.org/10.3837/tiis.2016.03.026

Search Result 195, Processing Time 0.026 seconds
