Search | Korea Science

PCA-based Variational Model Composition Method for Robust Speech Recognition with Time-Varying Background Noise (시변 잡음에 강인한 음성 인식을 위한 PCA 기반의 Variational 모델 생성 기법)

Kim, Wooil
- Journal of the Korea Institute of Information and Communication Engineering / v.17 no.12 / pp.2793-2799 / 2013
This paper proposes an effective feature compensation method to improve speech recognition performance under time-varying background noise. The proposed method uses principal component analysis to improve the variational model composition method, and it is employed to generate multiple environmental models for the PCGMM-based feature compensation scheme. Experimental results show that the proposed scheme improves speech recognition accuracy more effectively than conventional front-end methods across various SNR conditions of background music. It achieves an average relative improvement of 12.14% in WER compared to the previous variational model composition method.
https://doi.org/10.6109/jkiice.2013.17.12.2793

Variation Analysis of Feature Parameters According to the Channel Distortion of Korean Telephone Digit Speech (한국어 숫자음 전화음성의 채널왜곡에 따른 특징파라미터의 변이 분석)

Jung Sung-Yun;Son Jong-Mok;Kim Min-Sung;Bae Keun-Sung
- Proceedings of the IEEK Conference / 2002.06d / pp.191-194 / 2002
The ultimate goal of this work is to improve the speech recognition rate when the training and test data come from matched telephone environments. To analyze the effect of telephone channel distortion, which changes with every call, MFCC is used as the feature parameter, and CMN, RTCN, and RASTA are used as channel compensation techniques. For each case, the variation of the feature parameters of all phones is analyzed. We then measure the recognition rate for each compensation method using a continuous HMM recognizer and examine the relationship between variation and recognition rate.
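To illustrate one of the channel compensation techniques named in this abstract, here is a minimal sketch of cepstral mean normalization (CMN) in plain Python. It is not taken from the paper: the function name `cepstral_mean_normalize` and the toy feature values are assumptions made for illustration. The idea shown is that a fixed channel adds a roughly constant offset to each cepstral coefficient, so subtracting the per-utterance mean removes it.

```python
from typing import List


def cepstral_mean_normalize(frames: List[List[float]]) -> List[List[float]]:
    """Subtract the per-utterance mean of each cepstral coefficient."""
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Mean of every coefficient over all frames of the utterance.
    means = [sum(frame[k] for frame in frames) / n_frames for k in range(n_coeffs)]
    return [[frame[k] - means[k] for k in range(n_coeffs)] for frame in frames]


# Hypothetical 3-frame, 2-coefficient feature matrix (not real MFCC data).
clean = [[1.0, -2.0], [3.0, 0.0], [2.0, 2.0]]
# A stationary channel is modeled here as a constant additive cepstral offset.
channel_offset = [5.0, -1.0]
distorted = [[c + o for c, o in zip(frame, channel_offset)] for frame in clean]

normalized_clean = cepstral_mean_normalize(clean)
normalized_distorted = cepstral_mean_normalize(distorted)
```

Under this simplified channel model, `normalized_clean` and `normalized_distorted` coincide, which is why CMN is a common baseline for channel mismatch.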

Efficient Compensation of Spectral Tilt for Speech Recognition in Noisy Environment (잡음 환경에서 음성인식을 위한 스펙트럼 기울기의 효과적인 보상 방법)

Cho, Jungho
- The Journal of the Institute of Internet, Broadcasting and Communication / v.17 no.1 / pp.199-206 / 2017
Environmental noise can degrade the performance of a speech recognition system. This paper presents a procedure for cepstrum-based feature compensation that makes a recognition system robust to noise. The approach is based on direct compensation of spectral tilt to remove the effects of additive noise. The noise compensation scheme operates in the cepstral domain by calculating the spectral tilt of the log power spectrum. Spectral compensation is applied in combination with SNR-dependent cepstral mean compensation. Experimental results in the presence of white Gaussian noise, subway noise, and car noise show that the proposed compensation method achieves substantial improvements in recognition accuracy at various SNRs.
https://doi.org/10.7236/JIIBC.2017.17.1.199

Luminance Compensation using Feature Points and Histogram for VR Video Sequence (특징점과 히스토그램을 이용한 360 VR 영상용 밝기 보상 기법)

Lee, Geon-Won;Han, Jong-Ki
- Journal of Broadcast Engineering / v.22 no.6 / pp.808-816 / 2017
360 VR video systems have become important for providing an immersive effect to viewers. Such a system consists of stitching, projection, compression, inverse projection, and viewport extraction. This paper proposes an efficient luminance compensation technique for 360 VR video sequences that uses feature extraction and histogram equalization algorithms. The proposed luminance compensation algorithm enhances the performance of stitching in a 360 VR system. The simulation results show that the proposed technique is useful for increasing the quality of the displayed image.
https://doi.org/10.5909/JBE.2017.22.6.808
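The abstract above names histogram equalization as one building block. The following is a minimal sketch of textbook global histogram equalization on a flat list of gray levels, written in plain Python. It is not the paper's luminance compensation algorithm, and the function name `equalize_histogram` and the sample pixel values are assumptions for illustration only.

```python
from typing import List


def equalize_histogram(pixels: List[int], levels: int = 256) -> List[int]:
    """Map gray levels through the normalized cumulative histogram."""
    if not pixels:
        return []
    # Count how often each gray level occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution of the gray levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Constant image: nothing to stretch.
        return list(pixels)
    scale = (levels - 1) / (total - cdf_min)
    return [round((cdf[p] - cdf_min) * scale) for p in pixels]


# Hypothetical low-contrast patch: values occupy only the range 50..150.
patch = [50, 50, 100, 150]
equalized = equalize_histogram(patch)
```

After equalization the darkest level in the patch maps to 0 and the brightest to 255, so two overlapping views with different exposure end up on a more comparable luminance scale.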

A Study on the Noisy Speech Recognition Based on the Data-Driven Model Parameter Compensation (직접데이터 기반의 모델적응 방식을 이용한 잡음음성인식에 관한 연구)

Chung, Yong-Joo
- Speech Sciences / v.11 no.2 / pp.247-257 / 2004
There have been many research efforts to overcome the problems of speech recognition in noisy conditions. Among them, model-based compensation methods such as parallel model combination (PMC) and vector Taylor series (VTS) have been found to perform efficiently compared with earlier speech enhancement methods and feature-based approaches. In this paper, a data-driven model compensation approach that adapts the HMM (hidden Markov model) parameters for noisy speech recognition is proposed. Instead of assuming statistical approximations as in conventional model-based methods such as PMC, the statistics necessary for the HMM parameter adaptation are estimated directly using the Baum-Welch algorithm. The proposed method shows improved results compared with PMC for noisy speech recognition.

Scaling-Translation Parameter Estimation using Genetic Hough Transform for Background Compensation

Nguyen, Thuy Tuong;Pham, Xuan Dai;Jeon, Jae-Wook
- KSII Transactions on Internet and Information Systems (TIIS) / v.5 no.8 / pp.1423-1443 / 2011
Background compensation plays an important role in detecting and isolating object motion in visual tracking. Here, we propose a Genetic Hough Transform, which combines the Hough Transform and Genetic Algorithm, as a method for eliminating background motion. Our method can handle cases in which the background may contain only a few, if any, feature points. These points can be used to estimate the motion between two successive frames. In addition to dealing with featureless backgrounds, our method can successfully handle motion blur. Experimental comparisons of the results obtained using the proposed method with other methods show that the proposed approach yields a satisfactory estimate of background motion.
https://doi.org/10.3837/tiis.2011.08.004

Phase Compensation of Fuzzy Control Systems and Realization of Neuro-fuzzy Compensators

Tanaka, Kazuo;Sano, Manabu
- Proceedings of the Korean Institute of Intelligent Systems Conference / 1993.06a / pp.845-848 / 1993
This paper proposes a design method for a fuzzy phase-lead compensator and its self-learning by a neural network. The main feature of the fuzzy phase-lead compensator is that it has parameters for effectively compensating the phase characteristics of control systems. An important theorem related to phase-lead compensation is derived by introducing the concept of frequency characteristics. We propose a design procedure for fuzzy phase-lead compensators for linear controlled objects. Furthermore, we realize a neuro-fuzzy compensator for unknown or nonlinear controlled objects by using the Widrow-Hoff learning rule.
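The Widrow-Hoff learning rule mentioned in this abstract is the classic least-mean-squares (LMS) update: each weight moves in proportion to the output error times its input. The sketch below shows only that rule on a single linear unit in plain Python. It is not the paper's neuro-fuzzy compensator, and the function name `widrow_hoff_fit`, the learning rate, and the toy data are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def widrow_hoff_fit(
    samples: Sequence[Tuple[Sequence[float], float]],
    n_inputs: int,
    learning_rate: float = 0.1,
    epochs: int = 200,
) -> List[float]:
    """Train a linear unit with the Widrow-Hoff (LMS) rule."""
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for x, target in samples:
            output = sum(w * xi for w, xi in zip(weights, x))
            error = target - output
            # LMS update: w <- w + eta * error * x
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
    return weights


# Hypothetical training data generated by y = 2*x1 - 1*x2.
samples = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
weights = widrow_hoff_fit(samples, n_inputs=2)
```

With this small, consistent data set the weights approach 2 and -1. The learning rate must stay small relative to the input magnitudes, otherwise the updates can diverge.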

Feature Compensation Method Based on Parallel Combined Mixture Model (병렬 결합된 혼합 모델 기반의 특징 보상 기술)

Kim, Wooil;Lee, Heung-Kyu;Kwon, Oh-Il;Ko, Han-Seok
- The Journal of the Acoustical Society of Korea / v.22 no.7 / pp.603-611 / 2003
This paper proposes an effective feature compensation scheme based on a speech model for achieving robust speech recognition. Conventional model-based methods require off-line training with a noisy speech database and are not suitable for online adaptation. The proposed scheme relaxes the need for off-line training with a noisy speech database by employing the parallel model combination technique to estimate the correction factors. Applying the model combination process to the mixture model alone, as opposed to the entire HMM, makes online model combination possible. Exploiting the availability of a noise model from off-line sources, we accomplish online adaptation via MAP (Maximum A Posteriori) estimation. In addition, an online channel estimation procedure is derived within the proposed framework. For more efficient implementation, we propose a selective model combination that reduces the computational complexity. Representative experimental results indicate that the suggested algorithm is effective in realizing robust speech recognition under the combined adverse conditions of additive background noise and channel distortion.
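As a rough illustration of the model combination idea in this abstract, the sketch below applies the log-add approximation that is commonly used with parallel model combination: a clean-speech mean and a noise mean, both in the log-spectral domain, are combined by adding them in the linear domain. It handles means only and ignores variances, and it is not the paper's estimation procedure. The function name `log_add_combine`, the `gain` parameter, and the numbers are assumptions for illustration.

```python
import math
from typing import List, Sequence


def log_add_combine(
    speech_log_mean: Sequence[float],
    noise_log_mean: Sequence[float],
    gain: float = 1.0,
) -> List[float]:
    """Combine log-spectral means by summing in the linear domain."""
    return [
        math.log(gain * math.exp(s) + math.exp(n))
        for s, n in zip(speech_log_mean, noise_log_mean)
    ]


# Hypothetical two-bin log-spectral means for one mixture component.
speech_mean = [math.log(4.0), math.log(10.0)]
noise_mean = [math.log(1.0), math.log(0.001)]
noisy_mean = log_add_combine(speech_mean, noise_mean)
```

In the first bin the noise raises the combined mean noticeably, while in the second bin, where the noise is far below the speech level, the combined mean stays close to the clean-speech mean.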

Analysis of Feature Parameter Variation for Korean Digit Telephone Speech according to Channel Distortion and Recognition Experiment (한국어 숫자음 전화음성의 채널왜곡에 따른 특징파라미터의 변이 분석 및 인식실험)

Jung Sung-Yun;Son Jong-Mok;Kim Min-Sung;Bae Keun-Sung
- MALSORI / no.43 / pp.179-188 / 2002
Improving the recognition performance of connected-digit telephone speech remains an open problem. As a basic study toward that goal, this paper analyzes the variation of feature parameters of Korean digit telephone speech according to channel distortion. MFCC is used as the feature parameter for analysis and recognition. To analyze the effect of telephone channel distortion, which varies with each call, MFCCs are first obtained from the connected-digit telephone speech for each phoneme included in the Korean digits. Then CMN, RTCN, and RASTA are applied to the MFCCs as channel compensation techniques. Using the feature parameters MFCC, MFCC+CMN, MFCC+RTCN, and MFCC+RASTA, the variances of the phonemes are analyzed and recognition experiments are carried out for each case. The experimental results are discussed together with our findings.

Harmonics-based Spectral Subtraction and Feature Vector Normalization for Robust Speech Recognition

Beh, Joung-Hoon;Lee, Heung-Kyu;Kwon, Oh-Il;Ko, Han-Seok
- Speech Sciences / v.11 no.1 / pp.7-20 / 2004
In this paper, we propose a two-step noise compensation algorithm in feature extraction for achieving robust speech recognition. The proposed method requires no a priori information about the noisy environment and is simple to implement. First, in the frequency domain, Harmonics-based Spectral Subtraction (HSS) is applied to reduce the additive background noise and make the shape of the harmonics in the speech spectrum more pronounced. We then apply a judiciously weighted variance Feature Vector Normalization (FVN) to compensate for both channel distortion and additive noise. The weighted variance FVN compensates for the variance mismatch in both the speech and the non-speech regions. A representative performance evaluation using the Aurora 2 database shows that the proposed method yields a 27.18% relative improvement in accuracy under a multi-noise training task and a 57.94% relative improvement under a clean training task.
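For readers unfamiliar with the first step, the sketch below shows plain power spectral subtraction with a spectral floor for a single frame, in plain Python. It is the basic technique, not the harmonics-based variant (HSS) proposed in the paper. The function name `spectral_subtract`, the `floor` parameter, and the sample values are assumptions for illustration.

```python
from typing import List, Sequence


def spectral_subtract(
    noisy_power: Sequence[float],
    noise_power: Sequence[float],
    floor: float = 0.01,
) -> List[float]:
    """Subtract a noise power estimate per bin, keeping a small spectral floor."""
    # The floor keeps a fraction of the noisy power so no bin becomes negative.
    return [max(p - n, floor * p) for p, n in zip(noisy_power, noise_power)]


# Hypothetical power spectrum of one noisy frame and a noise estimate.
noisy_frame = [10.0, 2.0, 5.0]
noise_estimate = [1.0, 3.0, 5.0]
enhanced = spectral_subtract(noisy_frame, noise_estimate)
```

Bins where the noise estimate exceeds or equals the observed power fall back to the floor value instead of going to zero or negative, which limits the artifacts that a hard zero would introduce.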

Search Result 143, Processing Time 0.022 seconds
