Integrated Search | Korea Science

Filtering of Filter-Bank Energies for Robust Speech Recognition

Jung, Ho-Young
- ETRI Journal / Vol. 26, No. 3 / pp.273-276 / 2004
We propose a novel feature processing technique that provides a cepstral liftering effect in the log-spectral domain. Cepstral liftering aims to equalize the variance of cepstral coefficients for distance-based speech recognizers and, as a result, provides robustness to additive noise and speaker variability. However, in the popular hidden Markov model based framework, cepstral liftering has no effect on recognition performance. We derive a filtering method in the log-spectral domain that corresponds to cepstral liftering. The proposed method performs high-pass filtering based on the decorrelation of filter-bank energies. We show that in noisy speech recognition, the proposed method reduces the error rate by 52.7% relative to the conventional feature.

Modeling feature inference in causal categories

김신우;이형철
- 인지과학 / Vol. 28, No. 4 / pp.329-347 / 2017
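Early research on category-based feature inference reported a range of phenomena in human reasoning, including typicality, diversity, and similarity effects. Later studies found that participants' prior knowledge has a pervasive influence on such inferences. This study tested and modeled how causal knowledge, one kind of prior knowledge, affects feature inference. Participants performed a feature inference task on categories composed of four features that were linked in either a common-cause or a common-effect causal structure. The results showed, in addition to a typicality effect, violations of the causal Markov condition in the common-cause structure and causal discounting in the common-effect structure. To model these results, we assumed that participants perform feature inference based on the difference between the probability of a category exemplar in which the target feature is present and that of an exemplar in which it is absent (i.e., $p(E_{F(X)} \mid Cat) - p(E_{F(\sim X)} \mid Cat)$). Exemplar probabilities were computed on the basis of causal model theory (Rehder, 2003) and then applied to the inference for each target feature. The model was confirmed to predict not only the typicality effect observed in the participants' data but also the violations of the causal Markov condition and the causal discounting.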
https://doi.org/10.19066/cogsci.2017.28.4.007
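
The difference score in the abstract above can be illustrated with a small sketch. It assumes a common-cause network with one cause and three effects, a noisy-OR style parameterization, and hypothetical parameter values (`c`, `m`, `b`); none of these values come from the paper, and the function names are made up for illustration.

```python
def exemplar_probability(cause, effects, c, m, b):
    """Joint probability of one exemplar under an assumed common-cause network.

    c: base rate of the cause, m: causal strength, b: background rate of each
    effect. All three are hypothetical parameters chosen for illustration.
    """
    p = c if cause else 1.0 - c
    # An effect appears through the cause (strength m) or a background cause (b).
    p_effect = 1.0 - (1.0 - m) * (1.0 - b) if cause else b
    for e in effects:
        p *= p_effect if e else 1.0 - p_effect
    return p


def inference_score(cause, other_effects, c=0.5, m=0.8, b=0.2):
    """p(exemplar with target effect present) - p(exemplar with it absent)."""
    present = exemplar_probability(cause, [1] + list(other_effects), c, m, b)
    absent = exemplar_probability(cause, [0] + list(other_effects), c, m, b)
    return present - absent


# The cause is present in both cases; only the other two effects differ.
both_present = inference_score(cause=1, other_effects=[1, 1])
both_absent = inference_score(cause=1, other_effects=[0, 0])
print(both_present, both_absent)
```

In this toy setting the score for the target effect is larger when the other effects are present, even though the state of the cause is fixed. That is the kind of pattern the abstract describes as a violation of the causal Markov condition; the sketch does not reproduce the reported model fit.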

Statistical Speech Feature Selection for Emotion Recognition

Kwon Oh-Wook;Chan Kwokleung;Lee Te-Won
- The Journal of the Acoustical Society of Korea / Vol. 24, No. 4E / pp.144-151 / 2005
We evaluate the performance of emotion recognition via speech signals when a plain speaker talks to an entertainment robot. For each frame of a speech utterance, we extract the frame-based features: pitch, energy, formant, band energies, mel frequency cepstral coefficients (MFCCs), and velocity/acceleration of pitch and MFCCs. For discriminative classifiers, a fixed-length utterance-based feature vector is computed from the statistics of the frame-based features. Using a speaker-independent database, we evaluate the performance of two promising classifiers: support vector machine (SVM) and hidden Markov model (HMM). For angry/bored/happy/neutral/sad emotion classification, the SVM and HMM classifiers yield 42.3% and 40.8% accuracy, respectively. We show that the accuracy is significant compared to the performance by foreign human listeners.

Content-based Image Retrieval using an Improved Chain Code and Hidden Markov Model

조완현;이승희;박순영;박종현
- 대한전자공학회:학술대회논문집 / Proceedings of the 13th Joint Conference on Signal Processing, 대한전자공학회, 2000 / pp.375-378 / 2000
In this paper, we propose a novel content-based image retrieval system using both a Hidden Markov Model (HMM) and an improved chain code. A Gaussian Mixture Model (GMM) is applied to statistically model the color information of the image, and the Deterministic Annealing EM (DAEM) algorithm is employed to estimate the parameters of the GMM. This result is used to segment the given image. We use an improved chain code, which is invariant to rotation, translation, and scale, to extract the shape feature vectors for each image in the database. These are stored in the database together with each HMM, whose parameters (A, B, $\pi$) are estimated by the Baum-Welch algorithm. Given a feature vector obtained in the same way from the query image, the occurrence probability of each image is computed using the forward algorithm of the HMM. We use these probabilities for image retrieval and present the images with the highest similarity.
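
The retrieval step described above, scoring a query sequence against each stored HMM with the forward algorithm, might look like the following sketch. It assumes discrete observation symbols (for example, quantized chain-code directions) and toy parameters; the model names, values, and the `rank_images` helper are hypothetical and are not taken from the paper.

```python
def forward_likelihood(obs, pi, A, B):
    """Return P(obs | model) for a discrete HMM via the forward algorithm.

    pi[i]: initial state probability, A[i][j]: transition probability,
    B[i][k]: probability that state i emits symbol k. obs must be non-empty.
    """
    n_states = len(pi)
    # Initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(o_{t+1})
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][o]
            for j in range(n_states)
        ]
    # Termination: sum over the final forward variables
    return sum(alpha)


def rank_images(query_obs, models):
    """Sort image names by the likelihood their HMM assigns to the query."""
    scores = {
        name: forward_likelihood(query_obs, *params)
        for name, params in models.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical two-state models over a two-symbol alphabet: (pi, A, B).
models = {
    "image_a": ([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.9, 0.1], [0.2, 0.8]]),
    "image_b": ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]]),
}
query = [0, 0, 0]
ranking = rank_images(query, models)
print(ranking)
```

For long observation sequences the raw product underflows, so practical implementations usually work with scaled forward variables or log probabilities.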

HMM-based missing feature reconstruction for robust speech recognition in additive noise environments

조지원;박형민
- 말소리와 음성과학 / Vol. 6, No. 4 / pp.127-132 / 2014
This paper describes a robust speech recognition technique that reconstructs spectral components mismatched with the training environment. The cluster-based reconstruction method can compensate for unreliable components using reliable components in the same spectral vector by assuming an independent, identically distributed Gaussian-mixture process for the training spectral vectors. In contrast, the presented method exploits the temporal dependency of speech to reconstruct the components by introducing a hidden-Markov-model prior, which incorporates an internal state transition plausible for the observed spectral vector sequence. The experimental results indicate that the described method provides temporally consistent reconstruction and further improves average recognition performance compared to the conventional method.
https://doi.org/10.13064/KSSS.2014.6.4.127

HMM-Based Automatic Speech Recognition using EMG Signal

Lee Ki-Seung
- 대한의용생체공학회:의공학회지 / Vol. 27, No. 3 / pp.101-109 / 2006
It is known that there is a strong relationship between the human voice and the movements of the articulatory facial muscles. In this paper, we use this knowledge to implement an automatic speech recognition scheme that relies solely on surface electromyogram (EMG) signals. The EMG signals were acquired from three articulatory facial muscles. As a preliminary study, 10 Korean digits were used as the recognition targets. Various feature parameters, including filter bank outputs, linear predictive coefficients, and cepstrum coefficients, were evaluated to find appropriate parameters for EMG-based speech recognition. The sequence of EMG signals for each word is modelled within a hidden Markov model (HMM) framework. A continuous word recognition approach was investigated in this work; hence, the model for each word is obtained by concatenating subword models, and embedded re-estimation techniques were employed in the training stage. The findings indicate that such a system may be able to recognize speech with an accuracy of up to 90% when mel-filter bank outputs are used as the feature parameters.
https://doi.org/10.9718/JBER.2006.27.3.101

A Study on the Criteria to Decide the Number of Aircrafts Considering Operational Characteristics

손영수;김성우;윤봉규
- 한국군사과학기술학회지 / Vol. 17, No. 1 / pp.41-49 / 2014
In this paper, we consider a method to assess the required number of aircraft, which is a strategic variable in national security. This problem becomes more important in light of the F-X and KF-X projects of the ROKAF. Traditionally, the ATO (Air Tasking Order) and a fighting power index have been used to evaluate the number of aircraft required by the ROKAF. However, those methods consider only the static aspect of the aircraft requirement. This paper presents a model that accommodates the dynamic features of the aircraft requirement using an absorbing Markov chain. In conclusion, we suggest a dynamic model to evaluate the number of aircraft required, with key decision variables such as the destroying rate, failure rate, and repair rate.
https://doi.org/10.9766/KIMST.2014.17.1.041
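
As a rough illustration of how an absorbing Markov chain can be analyzed, the sketch below computes the expected number of steps before absorption by solving (I - Q) t = 1, where Q holds the transition probabilities among transient states. The two transient states (operational, under repair), the absorbing state (destroyed), and all rates are assumptions made up for this example; they are not the paper's model or data.

```python
def expected_steps_to_absorption(Q):
    """Solve (I - Q) t = 1 by Gauss-Jordan elimination.

    Q[i][j] is the one-step transition probability between transient states.
    Returns t, where t[i] is the expected number of steps before absorption
    when starting in transient state i.
    """
    n = len(Q)
    # Augmented matrix [I - Q | 1]
    M = [
        [(1.0 if i == j else 0.0) - Q[i][j] for j in range(n)] + [1.0]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        # Eliminate this column from every other row
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


# Hypothetical per-period probabilities:
#   operational: stay 0.85, fail (go to repair) 0.10, destroyed 0.05
#   under repair: repaired 0.50, still in repair 0.50
Q = [
    [0.85, 0.10],
    [0.50, 0.50],
]
steps = expected_steps_to_absorption(Q)
print(steps)
```

With these made-up rates, an aircraft that starts operational is expected to last about 24 periods before being destroyed, and one that starts under repair about 26.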

Applying feature normalization based on pole filtering to short-utterance speech recognition using deep neural network

한재민;김민식;김형순
- 한국음향학회지 / Vol. 39, No. 1 / pp.64-68 / 2020
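In conventional speech recognition systems based on the Gaussian Mixture Model-Hidden Markov Model (GMM-HMM), cepstral feature normalization based on pole filtering was effective at improving the recognition of short utterances in noisy environments. This paper examines whether the method remains useful in recent speech recognition systems based on deep neural networks (DNNs). Experiments on the AURORA 2 DB show that, particularly when the mismatch between training and test conditions is large, cepstral mean and variance normalization based on pole filtering improves the recognition of very short utterances compared with normalization without pole filtering.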
https://doi.org/10.7776/ASK.2020.39.1.064
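
For orientation, plain cepstral mean and variance normalization (the baseline without pole filtering) can be sketched as follows. This is only the standard per-utterance normalization; the pole filtering step studied in the paper is not implemented here, and the function name, the `eps` guard, and the toy frames are assumptions for illustration.

```python
import math


def cmvn(frames, eps=1e-10):
    """Per-utterance cepstral mean and variance normalization.

    frames: list of feature vectors (one per frame, equal length).
    Each dimension is shifted to zero mean and scaled to unit variance,
    with eps guarding against division by zero for constant dimensions.
    """
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(dim)]
        for f in frames
    ]


# Toy utterance: three frames, two cepstral dimensions.
frames = [[1.0, 10.0], [2.0, 14.0], [3.0, 18.0]]
normalized = cmvn(frames)
print(normalized)
```

With very short utterances the mean and variance are estimated from few frames, which is the situation the abstract's short-utterance experiments address.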

Analyzing performance of time series classification using STFT and time series imaging algorithms

Sung-Kyu Hong;Sang-Chul Kim
- 한국컴퓨터정보학회논문지 / Vol. 28, No. 4 / pp.1-11 / 2023
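This paper analyzes time series classification performance using convolutional neural networks instead of recurrent neural networks. In the TSC (Time Series Community), there are traditional time series imaging algorithms such as GAF (Gramian Angular Field), MTF (Markov Transition Field), and RP (Recurrence Plot). The experiments evaluate the performance of the convolutional neural network while tuning the hyperparameters required by the imaging algorithms. Evaluated on the GunPoint dataset from the UCR archive, the STFT (Short Time Fourier Transform) algorithm proposed in this paper, once optimized hyperparameters are found, achieves higher accuracy than the existing algorithms and has the advantage that the size of the feature map image can be adjusted dynamically. GAF also shows a high accuracy of 98 to 99%, but has the drawback that the size of its feature map image cannot be adjusted dynamically and is therefore large.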
https://doi.org/10.9708/jksci.2023.28.04.001
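
To make the image-size remark concrete, here is a minimal sketch of the summation variant of the Gramian Angular Field. It follows the commonly used definition (rescale to [-1, 1], map to angles with arccos, take the cosine of pairwise angle sums); the function name and toy series are assumptions, and the paper's exact preprocessing is not known from the abstract.

```python
import math


def gramian_angular_summation_field(series):
    """Encode a 1-D series of length n as an n x n image (GASF)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale to [-1, 1] so that arccos is defined for every sample
    scaled = [2.0 * (x - lo) / (hi - lo) - 1.0 for x in series]
    # Clamp to guard against tiny floating-point overshoot
    scaled = [max(-1.0, min(1.0, x)) for x in scaled]
    phi = [math.acos(x) for x in scaled]
    # Pixel (i, j) is cos(phi_i + phi_j)
    return [[math.cos(a + b) for b in phi] for a in phi]


image = gramian_angular_summation_field([0.0, 1.0, 2.0])
print(len(image), len(image[0]))
```

The output is always n x n for a series of length n, so the image grows quadratically with the series length unless the series is downsampled first.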

Video-based fall detection algorithm combining simple threshold method and Hidden Markov Model

박철호;유윤섭
- 한국정보통신학회논문지 / Vol. 18, No. 9 / pp.2101-2108 / 2014
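We propose an automatic fall detection algorithm that uses video information. To extract fall feature parameters for automatic fall detection, motion values are extracted by applying optical flow to the video, and the overall degree of change, the slope, and the center point of these motion values are computed using principal component analysis. Six fall feature parameters are defined using the computed eigenvalues and eigenvectors. We propose three methods and compare and analyze their results: a simple threshold method that judges a fall when the fall feature parameters exceed predefined thresholds, a method that judges a fall by applying the fall feature parameters to a hidden Markov model (HMM), and a fall detection method that combines the simple threshold with the HMM. The combined method uses the simple threshold method to determine candidate fall actions and then applies the HMM only to these candidates to detect falls. This method reduces the amount of computation while maintaining detection accuracy.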
https://doi.org/10.6109/jkiice.2014.18.9.2101
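
The two-stage idea in the abstract above, a cheap threshold screen followed by a more expensive HMM decision on the surviving candidates, can be sketched as below. The rule that any single parameter exceeding its threshold makes a candidate, the `hmm_is_fall` callable, and the stand-in classifier are all assumptions for illustration; the paper's actual features, thresholds, and HMM are not reproduced.

```python
def detect_falls(windows, thresholds, hmm_is_fall):
    """Threshold screen first, HMM decision only on candidates.

    windows: list of feature vectors, one per time window.
    thresholds: per-feature thresholds (hypothetical values).
    hmm_is_fall: callable standing in for an HMM-based classifier.
    Returns (indices judged as falls, number of HMM evaluations).
    """
    detections = []
    hmm_calls = 0
    for idx, features in enumerate(windows):
        # Stage 1: a window is a candidate if any feature exceeds its threshold
        if any(f > t for f, t in zip(features, thresholds)):
            # Stage 2: the costly model runs only on candidates
            hmm_calls += 1
            if hmm_is_fall(features):
                detections.append(idx)
    return detections, hmm_calls


# Toy data with a stand-in classifier in place of a trained HMM.
windows = [[0.1, 0.2], [0.9, 0.1], [0.8, 0.95]]
thresholds = [0.5, 0.5]
stand_in = lambda features: sum(features) > 1.5
detections, hmm_calls = detect_falls(windows, thresholds, stand_in)
print(detections, hmm_calls)
```

Here the second-stage model is evaluated on two of the three windows, which illustrates where the computational saving would come from.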

195 search results, processing time 0.031 s
