Search | Korea Science

A NMF-Based Speech Enhancement Method Using a Prior Time Varying Information and Gain Function (시간 변화에 따른 사전 정보와 이득 함수를 적용한 NMF 기반 음성 향상 기법)

Kwon, Kisoo;Jin, Yu Gwang;Bae, Soo Hyun;Kim, Nam Soo
- The Journal of Korean Institute of Communications and Information Sciences / v.38C no.6 / pp.503-511 / 2013
This paper presents a speech enhancement method using non-negative matrix factorization (NMF). In the training phase, a basis matrix is obtained for each of the speech and specific noise databases. In the enhancement phase, the noisy signal is separated into speech and noise estimates using these basis matrices. To improve performance, we model the change of the encoding matrix from the training phase to the enhancement phase using independent Gaussian distribution models, and add to the objective function a constraint that closely follows these Gaussian models. We also smooth the encoding matrix by taking its previous value into account. Finally, we apply a Log-Spectral Amplitude type algorithm as the gain function.
https://doi.org/10.7840/kics.2013.38C.6.503

Speech enhancement system using the multi-band coherence function and spectral subtraction method (다중 주파수 밴드 간섭함수와 스펙트럼 차감법을 이용한 음성 향상 시스템)

Oh, Inkyu;Lee, Insung
- The Journal of the Acoustical Society of Korea / v.38 no.4 / pp.406-413 / 2019
This paper proposes a speech enhancement method that combines a gain function with the spectral subtraction method for a closely spaced two-microphone array. A speech enhancement method that uses a gain function estimated from the SNR (Signal-to-Noise Ratio) based on the multi-frequency-band coherence function suffers performance degradation when the input noises of the two channels are highly correlated. The proposed method therefore uses a weighted gain function obtained by combining that gain function with the one from spectral subtraction. The method is evaluated using PESQ (Perceptual Evaluation of Speech Quality), an objective quality measure provided by the ITU-T (International Telecommunication Union Telecommunication Standardization Sector). In the PESQ tests, the PESQ score improves by up to 0.217 across various background noise environments.
https://doi.org/10.7776/ASK.2019.38.4.406

Adaptive Threshold for Speech Enhancement in Nonstationary Noisy Environments (비정상 잡음환경에서 음질향상을 위한 적응 임계 치 알고리즘)

Lee, Soo-Jeong;Kim, Sun-Hyob
- The Journal of the Acoustical Society of Korea / v.27 no.7 / pp.386-393 / 2008
This paper proposes a new approach for speech enhancement in highly nonstationary noisy environments. Spectral subtraction (SS) is a well-known technique for speech enhancement in stationary noisy environments. However, in the real world, noise is mostly nonstationary. The proposed method uses an auto control parameter for an adaptive threshold to work well in highly nonstationary noisy environments. In particular, the auto control parameter is governed by a linear function of the a posteriori signal-to-noise ratio (SNR) as the noise level increases or decreases. The proposed algorithm is combined with spectral subtraction (SS) using a hangover (HO) scheme for speech enhancement. The performance of the proposed method is evaluated by ITU-T P.835 signal distortion (SIG) and the segmental signal-to-noise ratio (SNR) in various highly nonstationary noisy environments, and is superior to that of conventional SS using a hangover (HO) and SS using minimum statistics (MS).
https://doi.org/10.7776/ASK.2008.27.7.386

Comparison of Two Speech Estimation Algorithms Based on Generalized-Gamma Distribution Applied to Speech Recognition in Car Noisy Environment (자동차 잡음환경에서의 음성인식에 적용된 두 종류의 일반화된 감마분포 기반의 음성추정 알고리즘 비교)

Kim, Hyoung-Gook;Lee, Jin-Ho
- The Journal of The Korea Institute of Intelligent Transport Systems / v.8 no.4 / pp.28-32 / 2009
This paper compares two speech estimators under a generalized Gamma distribution for DFT-based single-microphone speech enhancement. For the speech enhancement, noise estimation based on recursively averaging spectral values by spectral minimum noise is applied to two speech estimators based on the generalized Gamma distribution with $\kappa=1$ or $\kappa=2$. The performance of the two speech enhancement algorithms is measured by the recognition accuracy of automatic speech recognition (ASR) in a noisy car environment.

A SPECTRAL SUBTRACTION USING PHONEMIC AND AUDITORY PROPERTIES

Kang, Sun-Mee;Kim, Woo-Il;Ko, Han-Seok
- Speech Sciences / v.4 no.2 / pp.5-15 / 1998
This paper proposes a speech state-dependent spectral subtraction method that regulates blind spectral subtraction for improved enhancement. In the proposed method, a modified subtraction rule is applied selectively depending on whether the speech state is voiced or unvoiced, in an effort to incorporate the acoustic characteristics of phonemes. In particular, the objective of the proposed method is to remedy subtraction-induced signal distortion through two state-dependent procedures: spectrum sharpening and a minimum spectral bound. To remove the residual noise, the proposed method employs a procedure that exploits the masking effect. The proposed spectral subtraction, which includes state-dependent subtraction and residual noise reduction using the masking threshold, is effective in compensating for spectral distortion in unvoiced regions and in reducing residual noise.

Spatial Resolution Enhancement with Fiber-based Spectral Filtering for Optical Coherence Tomography

Choi, Eun-Seo;Na, Ji-Hoon;Lee, Byeong-Ha
- Journal of the Optical Society of Korea / v.7 no.4 / pp.216-223 / 2003
We report a technique that improves the spatial resolution of optical coherence tomography (OCT) by utilizing fiber-based spectral filtering. The proposed technique improves the resolution by filtering out the characteristic erbium peak from the amplified spontaneous emission (ASE) source spectrum and reshaping the spectrum to be Gaussian-like. We used a long period fiber grating (LPG) and an erbium doped fiber (EDF) absorber for the spectral filtering. An in-house made ASE source and a commercial ASE source [ASE-FL7002] were used as the OCT sources to study the proposed technique. The resolution of the OCT based on the in-house made ASE source is enhanced from 200 to 40 μm with an LPG, while the resolution of the OCT based on the commercial ASE source is enhanced from 25 to 19 μm with the aid of an EDF absorber. However, sidelobes still exist in the interferogram due to imperfect spectral filtering, which limits the resolution. Further enhancement of the spatial resolution of the OCT system using the ASE source is possible with the aid of cascaded LPGs and/or a carefully designed EDF absorber.
https://doi.org/10.3807/JOSK.2003.7.4.216

Pre-Processing for Performance Enhancement of Speech Recognition in Digital Communication Systems (디지털 통신 시스템에서의 음성 인식 성능 향상을 위한 전처리 기술)

Seo, Jin-Ho;Park, Ho-Chong
- The Journal of the Acoustical Society of Korea / v.24 no.7 / pp.416-422 / 2005
Speech recognition in digital communication systems has very low performance due to the spectral distortion caused by speech codecs. In this paper, the spectral distortion caused by speech codecs is analyzed, and a pre-processing method that compensates for the spectral distortion is proposed to enhance speech recognition performance. Three standard speech codecs, IS-127 EVRC, ITU G.729 CS-ACELP, and IS-96 QCELP, are considered for algorithm development and evaluation, and a single method that can be applied commonly to all codecs is developed. The performance of the proposed method is evaluated for the three codecs, and by using the speech features extracted from the compensated spectrum, the recognition rate is improved by up to $15.6\%$ compared with that obtained using the degraded speech features.

Improvement of the ASR Robustness using Combinations of Spectral Subtraction and KLT-based Adaptive Comb-filtering (스펙트럴 서브트렉션과 비동기 KLT 잡음 감소 기법의 조합에 의한 음성 인식 성능 개선)

Park, Sung-Joon
- Proceedings of the KSPS conference / 2003.05a / pp.207-210 / 2003
In this paper, combinations of speech enhancement techniques are evaluated experimentally. Specifically, spectral subtraction, KLT-based comb-filtering, and their combinations are applied to the Aurora2 database. The results show that recognition accuracy improves when KLT-based comb-filtering is applied after spectral subtraction.

Speech Enhancement Using Phase-Dependent A Priori SNR Estimator in Log-Mel Spectral Domain

Lee, Yun-Kyung;Park, Jeon Gue;Lee, Yun Keun;Kwon, Oh-Wook
- ETRI Journal / v.36 no.5 / pp.721-729 / 2014
We propose a novel phase-based method for single-channel speech enhancement to extract and enhance the desired signals in noisy environments by utilizing the phase information. In the method, a phase-dependent a priori signal-to-noise ratio (SNR) is estimated in the log-mel spectral domain to utilize both the magnitude and phase information of input speech signals. The phase-dependent estimator is incorporated into the conventional magnitude-based decision-directed approach that recursively computes the a priori SNR from noisy speech. Additionally, we reduce the performance degradation owing to the one-frame delay of the estimated phase-dependent a priori SNR by using a minimum mean square error (MMSE)-based and maximum a posteriori (MAP)-based estimator. In our speech enhancement experiments, the proposed phase-dependent a priori SNR estimator is shown to improve the output SNR by 2.6 dB for both the MMSE-based and MAP-based estimator cases as compared to a conventional magnitude-based estimator.
https://doi.org/10.4218/etrij.14.2214.0039

A Selection Method of Reliable Codevectors using Noise Estimation Algorithm (잡음 추정 알고리즘을 이용한 신뢰성 있는 코드벡터 조합의 선정 방법)

Jung, Seungmo;Kim, Moo Young
- Journal of the Institute of Electronics and Information Engineers / v.52 no.7 / pp.119-124 / 2015
Speech enhancement is required as a preprocessor for noise-robust speech recognition systems. Codebook-based Speech Enhancement (CBSE) is highly robust in nonstationary noise environments compared with conventional noise estimation algorithms. However, because CBSE depends on the trained codebook information, its performance is severely degraded for codevector combinations that have lower correlation with the input signal. To overcome this problem, only the reliable codevector combinations are selected, thereby excluding those that have lower correlation with the input signal. The proposed method outperforms the conventional CBSE in terms of Log-Spectral Distortion (LSD) and Perceptual Evaluation of Speech Quality (PESQ).
https://doi.org/10.5573/ieie.2015.52.7.119
