• Title/Summary/Keyword: Markov feature

Search Result 195, Processing Time 0.025 seconds

Filtering of Filter-Bank Energies for Robust Speech Recognition

  • Jung, Ho-Young
    • ETRI Journal
    • /
    • v.26 no.3
    • /
    • pp.273-276
    • /
    • 2004
  • We propose a novel feature processing technique which can provide a cepstral liftering effect in the log-spectral domain. Cepstral liftering aims at the equalization of variance of cepstral coefficients for the distance-based speech recognizer, and as a result, provides the robustness for additive noise and speaker variability. However, in the popular hidden Markov model based framework, cepstral liftering has no effect in recognition performance. We derive a filtering method in log-spectral domain corresponding to the cepstral liftering. The proposed method performs a high-pass filtering based on the decorrelation of filter-bank energies. We show that in noisy speech recognition, the proposed method reduces the error rate by 52.7% to conventional feature.

  • PDF

Modeling feature inference in causal categories (인과적 범주의 속성추론 모델링)

  • Kim, ShinWoo;Li, Hyung-Chul O.
    • Korean Journal of Cognitive Science
    • /
    • v.28 no.4
    • /
    • pp.329-347
    • /
    • 2017
  • Early research into category-based feature inference reported various phenomena in human thinking including typicality, diversity, similarity effects, etc. Later research discovered that participants' prior knowledge has an extensive influence on these sorts of reasoning. The current research tested the effects of causal knowledge on feature inference and conducted modeling on the results. Participants performed feature inference for categories consisted of four features where the features were connected either in common cause or common effect structure. The results showed typicality effects along with violations of causal Markov condition in common cause structure and causal discounting in common effect structure. To model the results, it was assumed that participants perform feature inference based on the difference between the probabilities of an exemplar with the target feature and an exemplar without the target feature (that is, $p(E_{F(X)}{\mid}Cat)-p(E_{F({\sim}X)}{\mid}Cat)$). Exemplar probabilities were computed based on causal model theory (Rehder, 2003) and applied to inference for target features. The results showed that the model predicts not only typicality effects but also violations of causal Markov condition and causal discounting observed in participants' data.

Statistical Speech Feature Selection for Emotion Recognition

  • Kwon Oh-Wook;Chan Kwokleung;Lee Te-Won
    • The Journal of the Acoustical Society of Korea
    • /
    • v.24 no.4E
    • /
    • pp.144-151
    • /
    • 2005
  • We evaluate the performance of emotion recognition via speech signals when a plain speaker talks to an entertainment robot. For each frame of a speech utterance, we extract the frame-based features: pitch, energy, formant, band energies, mel frequency cepstral coefficients (MFCCs), and velocity/acceleration of pitch and MFCCs. For discriminative classifiers, a fixed-length utterance-based feature vector is computed from the statistics of the frame-based features. Using a speaker-independent database, we evaluate the performance of two promising classifiers: support vector machine (SVM) and hidden Markov model (HMM). For angry/bored/happy/neutral/sad emotion classification, the SVM and HMM classifiers yield $42.3\%\;and\;40.8\%$ accuracy, respectively. We show that the accuracy is significant compared to the performance by foreign human listeners.

Content-based Image Retrieval using an Improved Chain Code and Hidden Markov Model (개선된 chain code와 HMM을 이용한 내용기반 영상검색)

  • 조완현;이승희;박순영;박종현
    • Proceedings of the IEEK Conference
    • /
    • 2000.09a
    • /
    • pp.375-378
    • /
    • 2000
  • In this paper, we propose a novo] content-based image retrieval system using both Hidden Markov Model(HMM) and an improved chain code. The Gaussian Mixture Model(GMM) is applied to statistically model a color information of the image, and Deterministic Annealing EM(DAEM) algorithm is employed to estimate the parameters of GMM. This result is used to segment the given image. We use an improved chain code, which is invariant to rotation, translation and scale, to extract the feature vectors of the shape for each image in the database. These are stored together in the database with each HMM whose parameters (A, B, $\pi$) are estimated by Baum-Welch algorithm. With respect to feature vector obtained in the same way from the query image, a occurring probability of each image is computed by using the forward algorithm of HMM. We use these probabilities for the image retrieval and present the highest similarity images based on these probabilities.

  • PDF

HMM-based missing feature reconstruction for robust speech recognition in additive noise environments (가산잡음환경에서 강인음성인식을 위한 은닉 마르코프 모델 기반 손실 특징 복원)

  • Cho, Ji-Won;Park, Hyung-Min
    • Phonetics and Speech Sciences
    • /
    • v.6 no.4
    • /
    • pp.127-132
    • /
    • 2014
  • This paper describes a robust speech recognition technique by reconstructing spectral components mismatched with a training environment. Although the cluster-based reconstruction method can compensate the unreliable components from reliable components in the same spectral vector by assuming an independent, identically distributed Gaussian-mixture process of training spectral vectors, the presented method exploits the temporal dependency of speech to reconstruct the components by introducing a hidden-Markov-model prior which incorporates an internal state transition plausible for an observed spectral vector sequence. The experimental results indicate that the described method can provide temporally consistent reconstruction and further improve recognition performance on average compared to the conventional method.

HMM-Based Automatic Speech Recognition using EMG Signal

  • Lee Ki-Seung
    • Journal of Biomedical Engineering Research
    • /
    • v.27 no.3
    • /
    • pp.101-109
    • /
    • 2006
  • It has been known that there is strong relationship between human voices and the movements of the articulatory facial muscles. In this paper, we utilize this knowledge to implement an automatic speech recognition scheme which uses solely surface electromyogram (EMG) signals. The EMG signals were acquired from three articulatory facial muscles. Preliminary, 10 Korean digits were used as recognition variables. The various feature parameters including filter bank outputs, linear predictive coefficients and cepstrum coefficients were evaluated to find the appropriate parameters for EMG-based speech recognition. The sequence of the EMG signals for each word is modelled by a hidden Markov model (HMM) framework. A continuous word recognition approach was investigated in this work. Hence, the model for each word is obtained by concatenating the subword models and the embedded re-estimation techniques were employed in the training stage. The findings indicate that such a system may have a capacity to recognize speech signals with an accuracy of up to 90%, in case when mel-filter bank output was used as the feature parameters for recognition.

A Study on the Criteria to Decide the Number of Aircrafts Considering Operational Characteristics (항공기 운용 특성을 고려한 적정 운용 대수 산정 기준 연구)

  • Son, Young-Su;Kim, Seong-Woo;Yoon, Bong-Kyoo
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.17 no.1
    • /
    • pp.41-49
    • /
    • 2014
  • In this paper, we consider a method to access the number of aircraft requirement which is a strategic variable in national security. This problem becomes more important considering the F-X and KF-X project in ROKAF. Traditionally, ATO(Air Tasking Order) and fighting power index have been used to evaluate the number of aircrafts required in ROKAF. However, those methods considers static aspect of aircraft requirement. This paper deals with a model to accommodate dynamic feature of aircraft requirement using absorbing Markov chain. In conclusion, we suggest a dynamic model to evaluate the number of aircrafts required with key decision variables such as destroying rate, failure rate and repair rate.

Applying feature normalization based on pole filtering to short-utterance speech recognition using deep neural network (심층신경망을 이용한 짧은 발화 음성인식에서 극점 필터링 기반의 특징 정규화 적용)

  • Han, Jaemin;Kim, Min Sik;Kim, Hyung Soon
    • The Journal of the Acoustical Society of Korea
    • /
    • v.39 no.1
    • /
    • pp.64-68
    • /
    • 2020
  • In a conventional speech recognition system using Gaussian Mixture Model-Hidden Markov Model (GMM-HMM), the cepstral feature normalization method based on pole filtering was effective in improving the performance of recognition of short utterances in noisy environments. In this paper, the usefulness of this method for the state-of-the-art speech recognition system using Deep Neural Network (DNN) is examined. Experimental results on AURORA 2 DB show that the cepstral mean and variance normalization based on pole filtering improves the recognition performance of very short utterances compared to that without pole filtering, especially when there is a large mismatch between the training and test conditions.

Analyzing performance of time series classification using STFT and time series imaging algorithms

  • Sung-Kyu Hong;Sang-Chul Kim
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.4
    • /
    • pp.1-11
    • /
    • 2023
  • In this paper, instead of using recurrent neural network, we compare a classification performance of time series imaging algorithms using convolution neural network. There are traditional algorithms that imaging time series data (e.g. GAF(Gramian Angular Field), MTF(Markov Transition Field), RP(Recurrence Plot)) in TSC(Time Series Classification) community. Furthermore, we compare STFT(Short Time Fourier Transform) algorithm that can acquire spectrogram that visualize feature of voice data. We experiment CNN's performance by adjusting hyper parameters of imaging algorithms. When evaluate with GunPoint dataset in UCR archive, STFT(Short-Time Fourier transform) has higher accuracy than other algorithms. GAF has 98~99% accuracy either, but there is a disadvantage that size of image is massive.

Video-based fall detection algorithm combining simple threshold method and Hidden Markov Model (단순 임계치와 은닉마르코프 모델을 혼합한 영상 기반 낙상 알고리즘)

  • Park, Culho;Yu, Yun Seop
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.18 no.9
    • /
    • pp.2101-2108
    • /
    • 2014
  • Automatic fall-detection algorithms using video-data are proposed. Six types of fall-feature parameters are defined applying the optical flows extracted from differential images to principal component analysis(PCA). One fall-detection algorithm is the simple threshold method that a fall is detected when a fall-feature parameter is over a threshold, another is to use the HMM, and the other is to combine the simple threshold and HMM. Comparing the performances of three types of fall-detection algorithm, the algorithm combining the simple threshold and HMM requires less computational resources than HMM and exhibits a higher accuracy than the simple threshold method.