Search | Korea Science

Noise Robust Emotion Recognition Feature : Frequency Range of Meaningful Signal (음성의 특정 주파수 범위를 이용한 잡음환경에서의 감정인식)

Kim Eun-Ho;Hyun Kyung-Hak;Kwak Yoon-Keun
- Journal of the Korean Society for Precision Engineering
- /
- v.23 no.5 s.182
- /
- pp.68-76
- /
- 2006
The ability to recognize human emotion is one of the hallmarks of human-robot interaction. Hence this paper describes the realization of emotion recognition. For emotion recognition from voice, we propose a new feature called frequency range of meaningful signal. With this feature, we reached average recognition rate of 76% in speaker-dependent. From the experimental results, we confirm the usefulness of the proposed feature. We also define the noise environment and conduct the noise-environment test. In contrast to other features, the proposed feature is robust in a noise-environment.
PDF KSCI

Constructing a Noise-Robust Speech Recognition System using Acoustic and Visual Information (청각 및 시가 정보를 이용한 강인한 음성 인식 시스템의 구현)

Lee, Jong-Seok;Park, Cheol-Hoon
- Journal of Institute of Control, Robotics and Systems
- /
- v.13 no.8
- /
- pp.719-725
- /
- 2007
In this paper, we present an audio-visual speech recognition system for noise-robust human-computer interaction. Unlike usual speech recognition systems, our system utilizes the visual signal containing speakers' lip movements along with the acoustic signal to obtain robust speech recognition performance against environmental noise. The procedures of acoustic speech processing, visual speech processing, and audio-visual integration are described in detail. Experimental results demonstrate the constructed system significantly enhances the recognition performance in noisy circumstances compared to acoustic-only recognition by using the complementary nature of the two signals.
https://doi.org/10.5302/J.ICROS.2007.13.8.719 인용 PDF KSCI

RECOGNITION SYSTEM USING VOCAL-CORD SIGNAL (성대 신호를 이용한 인식 시스템)

Cho, Kwan-Hyun;Han, Mun-Sung;Park, Jun-Seok;Jeong, Young-Gyu
- Proceedings of the KIEE Conference
- /
- 2005.10b
- /
- pp.216-218
- /
- 2005
This paper present a new approach to a noise robust recognizer for WPS interface. In noisy environments, performance of speech recognition is decreased rapidly. To solve this problem, We propose the recognition system using vocal-cord signal instead of speech. Vocal-cord signal has low quality but it is more robust to environment noise than speech signal. As a result, we obtained 75.21% accuracy using MFCC with CMS and 83.72% accuracy using ZCPA with RASTA.
PDF

Echo Noise Robust HMM Learning Model using Average Estimator LMS Algorithm (평균 예측 LMS 알고리즘을 이용한 반향 잡음에 강인한 HMM 학습 모델)

Ahn, Chan-Shik;Oh, Sang-Yeob
- Journal of Digital Convergence
- /
- v.10 no.10
- /
- pp.277-282
- /
- 2012
The speech recognition system can not quickly adapt to varied environmental noise factors that degrade the performance of recognition. In this paper, the echo noise robust HMM learning model using average estimator LMS algorithm is proposed. To be able to adapt to the changing echo noise HMM learning model consists of the recognition performance is evaluated. As a results, SNR of speech obtained by removing Changing environment noise is improved as average 3.1dB, recognition rate improved as 3.9%.
https://doi.org/10.14400/JDPM.2012.10.10.277 인용 PDF

Multimodal audiovisual speech recognition architecture using a three-feature multi-fusion method for noise-robust systems

Sanghun Jeon;Jieun Lee;Dohyeon Yeo;Yong-Ju Lee;SeungJun Kim
- ETRI Journal
- /
- v.46 no.1
- /
- pp.22-34
- /
- 2024
Exposure to varied noisy environments impairs the recognition performance of artificial intelligence-based speech recognition technologies. Degraded-performance services can be utilized as limited systems that assure good performance in certain environments, but impair the general quality of speech recognition services. This study introduces an audiovisual speech recognition (AVSR) model robust to various noise settings, mimicking human dialogue recognition elements. The model converts word embeddings and log-Mel spectrograms into feature vectors for audio recognition. A dense spatial-temporal convolutional neural network model extracts features from log-Mel spectrograms, transformed for visual-based recognition. This approach exhibits improved aural and visual recognition capabilities. We assess the signal-to-noise ratio in nine synthesized noise environments, with the proposed model exhibiting lower average error rates. The error rate for the AVSR model using a three-feature multi-fusion method is 1.711%, compared to the general 3.939% rate. This model is applicable in noise-affected environments owing to its enhanced stability and recognition rate.
https://doi.org/10.4218/etrij.2023-0266 인용 PDF

Harmonics-based Spectral Subtraction and Feature Vector Normalization for Robust Speech Recognition

Beh, Joung-Hoon;Lee, Heung-Kyu;Kwon, Oh-Il;Ko, Han-Seok
- Speech Sciences
- /
- v.11 no.1
- /
- pp.7-20
- /
- 2004
In this paper, we propose a two-step noise compensation algorithm in feature extraction for achieving robust speech recognition. The proposed method frees us from requiring a priori information on noisy environments and is simple to implement. First, in frequency domain, the Harmonics-based Spectral Subtraction (HSS) is applied so that it reduces the additive background noise and makes the shape of harmonics in speech spectrum more pronounced. We then apply a judiciously weighted variance Feature Vector Normalization (FVN) to compensate for both the channel distortion and additive noise. The weighted variance FVN compensates for the variance mismatch in both the speech and the non-speech regions respectively. Representative performance evaluation using Aurora 2 database shows that the proposed method yields 27.18% relative improvement in accuracy under a multi-noise training task and 57.94% relative improvement under a clean training task.
PDF

A Spectral Compensation Method for Noise Robust Speech Recognition (잡음에 강인한 음성인식을 위한 스펙트럼 보상 방법)

Cho, Jung-Ho
- 전자공학회논문지 IE
- /
- v.49 no.2
- /
- pp.9-17
- /
- 2012
One of the problems on the application of the speech recognition system in the real world is the degradation of the performance by acoustical distortions. The most important source of acoustical distortion is the additive noise. This paper describes a spectral compensation technique based on a spectral peak enhancement scheme followed by an efficient noise subtraction scheme for noise robust speech recognition. The proposed methods emphasize the formant structure and compensate the spectral tilt of the speech spectrum while maintaining broad-bandwidth spectral components. The recognition experiments was conducted using noisy speech corrupted by white Gaussian noise, car noise, babble noise or subway noise. The new technique reduced the average error rate slightly under high SNR(Signal to Noise Ratio) environment, and significantly reduced the average error rate by 1/2 under low SNR(10 dB) environment when compared with the case of without spectral compensations.
PDF KSCI

An Efficient Model Parameter Compensation Method foe Robust Speech Recognition

Chung Yong-Joo
- MALSORI
- /
- no.45
- /
- pp.107-115
- /
- 2003
An efficient method that compensates the HMM parameters for the noisy speech recognition is proposed. Instead of assuming some analytical approximations as in the PMC, the proposed method directly re-estimates the HMM parameters by the segmental k-means algorithm. The proposed method has shown improved results compared with the conventional PMC method at reduced computational cost.
PDF

Noise Robust Automatic Speech Recognition Scheme with Histogram of Oriented Gradient Features

Park, Taejin;Beack, SeungKwan;Lee, Taejin
- IEIE Transactions on Smart Processing and Computing
- /
- v.3 no.5
- /
- pp.259-266
- /
- 2014
In this paper, we propose a novel technique for noise robust automatic speech recognition (ASR). The development of ASR techniques has made it possible to recognize isolated words with a near perfect word recognition rate. However, in a highly noisy environment, a distinct mismatch between the trained speech and the test data results in a significantly degraded word recognition rate (WRA). Unlike conventional ASR systems employing Mel-frequency cepstral coefficients (MFCCs) and a hidden Markov model (HMM), this study employ histogram of oriented gradient (HOG) features and a Support Vector Machine (SVM) to ASR tasks to overcome this problem. Our proposed ASR system is less vulnerable to external interference noise, and achieves a higher WRA compared to a conventional ASR system equipped with MFCCs and an HMM. The performance of our proposed ASR system was evaluated using a phonetically balanced word (PBW) set mixed with artificially added noise.
https://doi.org/10.5573/IEIESPC.2014.3.5.259 인용 PDF KSCI

Robust Speech Recognition with Car Noise based on the Wavelet Filter Banks (웨이블렛 필터뱅크를 이용한 자동차 소음에 강인한 고립단어 음성인식)

Lee, Dae-Jong;Kwak, Keun-Chang;Ryu, Jeong-Woong;Chun, Myung-Geun
- Journal of the Korean Institute of Intelligent Systems
- /
- v.12 no.2
- /
- pp.115-122
- /
- 2002
This paper proposes a robust speech recognition algorithm based on the wavelet filter banks. Since the proposed algorithm adopts a multiple band decision-making scheme, it performs robustness for noise as the presence of noisy severely degrades the performance of speech recognition system. For evaluating the performance of the proposed scheme, we compared it with the conventional speech recognizer based on the VQ for the 10-isolated korean digits with car noise. Here, the proposed method showed more 9~27% improvement of the recognition rate than the conventional VQ algorithm for the various car noisy environments.
https://doi.org/10.5391/JKIIS.2002.12.2.115 인용 PDF KSCI

Search Result 207, Processing Time 0.029 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)