• Title/Summary/Keyword: speech enhancement

On-line noise coherence estimation algorithm for binaural speech enhancement system (양이형 음성 음질개선 시스템을 위한 온라인 잡음 상관도 추정 알고리즘)

  • Ji, Youna;Baek, Yong-hyun;Park, Young-cheol
    • The Journal of the Acoustical Society of Korea / v.35 no.3 / pp.234-242 / 2016
  • In this paper, an on-line noise coherence estimation algorithm for a binaural speech enhancement system is proposed. A number of noise power spectral density (PSD) estimation algorithms based on the noise coherence between two microphones have been proposed to improve speech enhancement performance. In the conventional algorithms, the noise coherence was characterized using a real-valued analytic model. However, unlike the analytic model, the noise coherence between the two microphones is time-varying in real environments. Thus, in this paper, the noise coherence is updated in accordance with variations of the acoustic environment so as to track the actual noise coherence. The noise coherence can be updated only during the absence of speech, and the simulation results demonstrate the superiority of the proposed algorithm over the conventional algorithms based on the analytic model.
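
A minimal sketch of the general idea, not the authors' algorithm: the auto- and cross-power spectra of the two channels are recursively smoothed, and the coherence estimate is refreshed only in frames judged speech-absent. The frame length, smoothing constant, and the crude energy-based speech-absence test below are all assumptions of this sketch.

```python
import numpy as np

def track_noise_coherence(left, right, frame=512, hop=256, alpha=0.9, vad_thresh=1e-4):
    """Recursively estimate the inter-microphone noise coherence,
    updating it only in frames judged to be speech-absent.
    Illustrative sketch; parameters and the energy-based VAD are assumptions."""
    win = np.hanning(frame)
    n_bins = frame // 2 + 1
    p_ll = np.full(n_bins, 1e-12)          # smoothed auto-PSD, left channel
    p_rr = np.full(n_bins, 1e-12)          # smoothed auto-PSD, right channel
    p_lr = np.zeros(n_bins, complex)       # smoothed cross-PSD
    coherence = np.zeros(n_bins, complex)  # complex inter-channel coherence

    for start in range(0, len(left) - frame, hop):
        L = np.fft.rfft(win * left[start:start + frame])
        R = np.fft.rfft(win * right[start:start + frame])
        # First-order recursive smoothing of the spectral densities.
        p_ll = alpha * p_ll + (1 - alpha) * np.abs(L) ** 2
        p_rr = alpha * p_rr + (1 - alpha) * np.abs(R) ** 2
        p_lr = alpha * p_lr + (1 - alpha) * L * np.conj(R)
        # Crude energy-based speech-absence test (stand-in for a real detector):
        # the coherence estimate is refreshed only when speech is judged absent.
        if np.mean(np.abs(L) ** 2 + np.abs(R) ** 2) / frame < vad_thresh:
            coherence = p_lr / np.sqrt(p_ll * p_rr)
    return coherence
```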

Performance Enhancement of Speech Intelligibility in Communication System Using Combined Beamforming (directional microphone) and Speech Filtering Method (방향성 마이크로폰과 음성 필터링을 이용한 통신 시스템의 음성 인지도 향상)

  • Shin, Min-Cheol;Wang, Se-Myung
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2005.05a / pp.334-337 / 2005
  • Speech intelligibility is one of the most important factors in a communication system, and it is closely related to the speech-to-noise ratio. To enhance the speech-to-noise ratio, background noise reduction techniques are being developed. As part of a solution for noise reduction, this paper introduces a directional microphone using a beamforming method combined with a speech-filtering method. The directional microphone narrows the spatial range of the processed signal to the direction of the target speech signal. Noise located in the same direction as the speech, however, still remains in the processed signal. To separate this mixed signal into speech and noise, a speech-filtering method based on the characteristics of the speech signal itself is then applied to extract only the speech component. The combination of the directional microphone and the speech-filtering method improves speech intelligibility in the communication system.
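
As a rough illustration of the beamforming half of this approach (not the authors' system), the sketch below forms a delay-and-sum beam toward a given look direction with a uniform linear microphone array; the array geometry, the far-field assumption, and the frequency-domain fractional delay are simplifications of this sketch.

```python
import numpy as np

def delay_and_sum(mic_signals, mic_spacing, look_angle_deg, fs, c=343.0):
    """Steer a uniform linear array toward look_angle_deg by delay-compensating
    and averaging the microphone channels (illustrative sketch only).

    mic_signals : (num_mics, num_samples) array of time-aligned recordings
    mic_spacing : distance between adjacent microphones in metres (assumed)
    """
    num_mics, num_samples = mic_signals.shape
    angle = np.deg2rad(look_angle_deg)          # 0 degrees = broadside
    freqs = np.fft.rfftfreq(num_samples, d=1.0 / fs)
    out = np.zeros(num_samples)
    for m in range(num_mics):
        # Propagation delay of a far-field source at the look direction,
        # relative to microphone 0 under the stated geometry.
        tau = m * mic_spacing * np.sin(angle) / c
        # Advance channel m by tau so signals from the look direction add coherently.
        spectrum = np.fft.rfft(mic_signals[m])
        out += np.fft.irfft(spectrum * np.exp(2j * np.pi * freqs * tau),
                            n=num_samples)
    return out / num_mics
```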

A Study of Acoustic Masking Effect from Formant Enhancement in Digital Hearing Aid (디지털 보청기에서의 포먼트 강조에 의한 마스킹 효과 연구)

  • Jeon, Yu-Yong;Kil, Se-Kee;Yoon, Kwang-Sub;Lee, Sang-Min
    • Journal of the Institute of Electronics Engineers of Korea SC / v.45 no.5 / pp.13-20 / 2008
  • Although digital hearing aid algorithms have been developed to compensate for hearing loss and to help hearing-impaired people communicate with others, digital hearing aid users still complain of difficulty in understanding speech. One reason could be that the quality of speech delivered through a digital hearing aid is insufficient for understanding, owing to feedback, residual noise, and other factors. Another is the masking effect among formants, which lowers sound quality. In this study, we measured the masking characteristics of normal listeners and hearing-impaired listeners with presbyacusis to confirm the masking effect within speech itself. The experiment consisted of five tests: a pure-tone test, a speech reception threshold (SRT) test, a word recognition score (WRS) test, a pure-tone masking test, and a speech masking test. In the speech masking test, each speech set contained 25 utterances, and the log likelihood ratio (LLR) was introduced to evaluate the distortion of each utterance objectively. As a result, speech perception decreased as the amount of formant enhancement increased. The enhanced utterances within a speech set had statistically similar LLRs, whereas their speech perception scores did not, which indicates that the acoustic masking effect, rather than distortion, influences speech perception. Indeed, frequency analysis of the utterances that subjects could not answer correctly showed a level difference between the first and second formants of about 35 dB, similar to the result of the pure-tone masking test (normal-hearing subjects: 36.36 dB; hearing-impaired subjects: 32.86 dB). The masking characteristics of normal listeners and hearing-impaired listeners were not similar, so it is necessary to measure the masking characteristics before fitting a hearing aid and to apply them to the fitting.
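
For reference, the log likelihood ratio used here as an objective distortion measure is commonly computed from LPC coefficients of matched clean and processed frames. The sketch below follows that standard formulation, not necessarily the paper's exact configuration; the LPC order of 10 is an assumption.

```python
import numpy as np
from scipy.linalg import toeplitz, solve_toeplitz

def lpc_coefficients(frame, order=10):
    """Prediction-error filter [1, -a1, ..., -ap] via the autocorrelation method."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:len(frame) + order]
    a = solve_toeplitz((r[:-1], r[:-1]), r[1:])   # Yule-Walker: R a = r
    return np.concatenate(([1.0], -a))

def log_likelihood_ratio(clean_frame, processed_frame, order=10):
    """LLR distortion between a clean frame and its processed counterpart
    (standard LPC-based definition; illustrative sketch)."""
    a_c = lpc_coefficients(clean_frame, order)
    a_p = lpc_coefficients(processed_frame, order)
    r = np.correlate(clean_frame, clean_frame,
                     mode="full")[len(clean_frame) - 1:len(clean_frame) + order]
    R = toeplitz(r)  # autocorrelation matrix of the clean frame
    return float(np.log((a_p @ R @ a_p) / (a_c @ R @ a_c)))
```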

Effects on the Speech Enhancement Algorithms for Sensorineural Hearing Impairment and Normal Listeners (배경잡음하에서의 감음신경성난청과 정상청력자의 어음인지향상 연구)

  • Kim, D.W.;Kim, I.Y.;Youn, G.W.
    • Proceedings of the KOSOMBE Conference / v.1998 no.11 / pp.171-172 / 1998
  • Recent developments in digital technology have opened new possibilities for notable advances in hearing aids. Using digital technology, it is possible to equip hearing aids with powerful features such as multi-channel nonlinear compression amplification and feedback cancellation, which are often difficult to implement with analog circuits. Still, speech in noise is one of the major complaints not only of hearing-impaired persons but also of normal listeners. This paper describes speech intelligibility in background noise for both normal and hearing-impaired listeners. Speech enhancement algorithms were implemented and compared for normal listeners and listeners with sensorineural hearing impairment.

Speech Enhancement with Decomposition into Deterministic and Stochastic components and Psychoacoustic Model (결정적/확률적 요소로의 음성 분해와 심리음향 모델 기반 잡음 제거 기법)

  • Jo, Seok-Hwan;Yoo, Chang-D.
    • Proceedings of the IEEK Conference / 2007.07a / pp.301-302 / 2007
  • A speech enhancement algorithm based on both a decomposition of speech into deterministic and stochastic components and a psychoacoustic model is proposed. Noisy speech is decomposed into deterministic and stochastic components, and each component is then enhanced while preserving its individual characteristics. A psychoacoustic model is taken into account when enhancing the stochastic component. Simulation results show that the proposed algorithm performs better than some of the more popular algorithms.
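
The abstract does not spell out the decomposition, but the general flavour can be illustrated with a harmonic/residual split of the spectrogram followed by separate attenuation of each part. The median-filter masks, noise-floor estimate, and oversubtraction factor below are assumptions of this stand-in sketch (the psychoacoustic weighting of the stochastic part is omitted), not the authors' method.

```python
import numpy as np
from scipy.signal import stft, istft
from scipy.ndimage import median_filter

def enhance_det_stoch(noisy, fs, kernel=17, oversub=1.5, floor=0.05):
    """Split noisy speech into a quasi-deterministic (sustained/harmonic) part
    and a stochastic residual, then attenuate noise in each part separately.
    Illustrative stand-in, not the paper's algorithm."""
    f, t, Z = stft(noisy, fs=fs, nperseg=512)
    mag = np.abs(Z)
    # Median filtering along time favours sustained (quasi-harmonic) energy;
    # along frequency it favours broadband (noise-like) energy.
    det_mag = median_filter(mag, size=(1, kernel))
    sto_mag = median_filter(mag, size=(kernel, 1))
    mask_det = det_mag / (det_mag + sto_mag + 1e-12)

    det = Z * mask_det            # deterministic component
    sto = Z * (1.0 - mask_det)    # stochastic component

    # Rough noise-floor estimate from the quietest frames of the residual.
    noise_psd = np.percentile(np.abs(sto) ** 2, 10, axis=1, keepdims=True)
    # Spectral subtraction with a floor on the stochastic part,
    # a Wiener-style gain on the deterministic part.
    gain_sto = np.maximum(1.0 - oversub * noise_psd / (np.abs(sto) ** 2 + 1e-12), floor)
    gain_det = np.abs(det) ** 2 / (np.abs(det) ** 2 + noise_psd + 1e-12)

    _, enhanced = istft(det * gain_det + sto * np.sqrt(gain_sto), fs=fs, nperseg=512)
    return enhanced
```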

Speech Enhancement Based on Psychoacoustic Model

  • Lee, Jingeol;Kim, Soowon
    • The Journal of the Acoustical Society of Korea / v.19 no.3E / pp.12-18 / 2000
  • Psychoacoustic-model-based methods have recently been introduced to enhance speech signals corrupted by ambient noise. In particular, the perceptual filter is analytically derived so that the frequency content of the input noisy signal is made the same as that of the estimated clean signal in the auditory domain. However, the analytical derivation must rely on the deconvolution associated with the spreading function of the psychoacoustic model, which results in an ill-conditioned problem. To cope with this problem, we propose a novel psychoacoustic-model-based speech enhancement filter whose principle is the same as that of the perceptual filter, but which is derived by a constrained optimization that provides solutions to the ill-conditioned problem. It is demonstrated with artificially generated signals that the proposed filter operates according to this principle, and it is shown that the proposed filter outperforms the perceptual filter provided that the clean speech signal is separable from the noise.
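
The numerical issue described here, namely that inverting the spreading-function convolution is ill-conditioned, can be pictured in a few lines: rather than applying the inverse of the spreading matrix directly, a non-negativity-constrained least-squares fit is solved. The symmetric, exponentially decaying spreading function below is a placeholder, not the paper's psychoacoustic model.

```python
import numpy as np
from scipy.optimize import nnls
from scipy.linalg import toeplitz

def spreading_matrix(n_bands, slope=0.6):
    """Toy symmetric spreading function that decays with critical-band
    distance (placeholder for a real psychoacoustic spreading function)."""
    taps = slope ** np.abs(np.arange(n_bands))
    return toeplitz(taps)

def deconvolve_excitation(spread_energy, n_bands):
    """Recover the per-band excitation E from S @ E = spread_energy.

    Direct inversion of S is ill-conditioned, so a non-negative
    least-squares solution is computed instead (illustrative sketch)."""
    S = spreading_matrix(n_bands)
    excitation, _residual = nnls(S, spread_energy)
    return excitation
```

The non-negativity constraint stands in for the kind of constrained optimization the abstract describes: it keeps the recovered excitation physically meaningful even when small perturbations of the spread energy would make the unconstrained inverse blow up.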

Rao-Blackwellized Particle Filtering for Sequential Speech Enhancement (Rao-Blackwellized particle filter를 이용한 순차적 음성 강조)

  • Park Sun-Ho;Choi Seun-Jin
    • Proceedings of the Korean Information Science Society Conference / 2006.06b / pp.151-153 / 2006
  • We present a method of sequential speech enhancement in which the clean speech signal is inferred with a Rao-Blackwellized particle filter (RBPF), given a noise-contaminated observation. In contrast to Kalman filtering-based methods, we consider a non-Gaussian speech generative model based on the generalized auto-regressive (GAR) model. Model parameters are learned by a sequential Newton-Raphson expectation maximization (SNEM) procedure that incorporates the RBPF. Empirical comparison with a Kalman filter confirms the high performance of the proposed method.
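
A full Rao-Blackwellized particle filter with a GAR speech model is beyond a short snippet, but the sequential Monte Carlo idea can be sketched with a plain bootstrap particle filter on a first-order AR speech model with heavy-tailed innovations. Every model choice below (AR(1) dynamics, Laplacian driving noise, the parameter values) is an assumption of this simplified sketch, not the paper's RBPF.

```python
import numpy as np

def bootstrap_particle_filter(noisy, n_particles=500, ar=0.95,
                              drive_scale=0.1, obs_std=0.3, seed=None):
    """Sequentially estimate clean samples from noisy observations with a
    bootstrap particle filter (simplified stand-in for an RBPF)."""
    rng = np.random.default_rng(seed)
    particles = np.zeros(n_particles)
    weights = np.full(n_particles, 1.0 / n_particles)
    estimate = np.empty(len(noisy))

    for t, y in enumerate(noisy):
        # Propagate: AR(1) dynamics with heavy-tailed (Laplacian) innovations.
        particles = ar * particles + rng.laplace(0.0, drive_scale, n_particles)
        # Reweight by the Gaussian observation likelihood p(y | particle).
        weights = weights * np.exp(-0.5 * ((y - particles) / obs_std) ** 2) + 1e-300
        weights /= weights.sum()
        estimate[t] = np.dot(weights, particles)   # posterior-mean estimate
        # Resample when the effective sample size collapses.
        if 1.0 / np.sum(weights ** 2) < n_particles / 2:
            idx = rng.choice(n_particles, n_particles, p=weights)
            particles = particles[idx]
            weights = np.full(n_particles, 1.0 / n_particles)
    return estimate
```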

A Study on Acoustic Masking Effect by Frame-Based Formant Enhancement (프레임 기반의 포먼트 강조에 의한 음향 마스킹 현상 발생에 대한 연구)

  • Jeon, Yu-Yong;Kim, Kyu-Sung;Lee, Sang-Min
    • Journal of Biomedical Engineering Research / v.30 no.6 / pp.529-534 / 2009
  • One of the characteristics of the hearing impaired is that their frequency selectivity is poorer than that of normal-hearing listeners. To compensate for this, formant enhancement algorithms and spectral contrast enhancement algorithms have been developed. However, in some cases these algorithms fail to improve the frequency selectivity of the hearing impaired; one of the reasons is acoustic masking among the enhanced formants. In this study, we tried to enhance the formants based on the individual masking characteristic of each subject. The masking characteristic used was the minimum level difference (MLD) between the first and second formants at which acoustic masking occurred. If the level difference between the two formants in a frame was larger than the MLD, the gain of the first formant was decreased to reduce the acoustic masking among formants. In a speech discrimination test using the formant-enhanced speech, the speech discrimination score (SDS) for speech with differently enhanced formants was significantly superior to the SDS for speech with equally enhanced formants. This means that suppressing acoustic masking among formants improves the frequency selectivity of the hearing impaired.
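
A compact way to picture the frame-based rule described here: if the first formant exceeds the second by more than the listener's measured MLD, the first-formant gain is pulled back by the excess. The function below is a schematic sketch operating on per-frame formant levels in dB; how formant levels are estimated and how the gains are applied to the signal are outside its scope.

```python
def limit_formant_gains(f1_level_db, f2_level_db, mld_db):
    """Return gain adjustments (in dB) for the first and second formants of one
    frame so that their level difference does not exceed the listener's MLD.
    Schematic sketch of the masking-aware rule, not the paper's implementation."""
    level_diff = f1_level_db - f2_level_db
    if level_diff > mld_db:
        # Reduce the first-formant gain by the amount exceeding the MLD.
        return -(level_diff - mld_db), 0.0
    return 0.0, 0.0

# Example: with an MLD of 33 dB, a frame whose formant levels differ by 35 dB
# has its first formant attenuated by 2 dB.
print(limit_formant_gains(-10.0, -45.0, 33.0))   # -> (-2.0, 0.0)
```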

Intelligibility Improvement Benefit of Clear Speech and Korean Stops

  • Kang, Kyoung-Ho
    • Phonetics and Speech Sciences / v.2 no.1 / pp.3-11 / 2010
  • The present study confirmed the intelligibility improvement benefit of clear speech by investigating the intelligibility of Korean stops produced in different speaking styles: conversational, citation-form, and clear speech. A progressive intelligibility improvement was found across the three speaking styles investigated: clear speech was more intelligible than citation-form speech, citation-form speech was more intelligible than conversational speech, and clear speech was also more intelligible than conversational speech. These findings suggest that the manipulations used to elicit three distinct speaking styles in a laboratory setting were successful, and they support the Hypo- & Hyper-speech theory that speakers adjust vocal effort to accommodate hearers' speech perception difficulty. Korean lenis stops showed the least intelligibility improvement among the three Korean stop types, which suggests that lenis stops are more resistant than aspirated and fortis stops to intelligibility enhancement efforts in clear speech.

A Noise Robust Speech Recognition Method Using Model Compensation Based on Speech Enhancement (음성 개선 기반의 모델 보상 기법을 이용한 강인한 잡음 음성 인식)

  • Shen, Guang-Hu;Jung, Ho-Youl;Chung, Hyun-Yeol
    • The Journal of the Acoustical Society of Korea / v.27 no.4 / pp.191-199 / 2008
  • In this paper, we propose an MWF-PMC noise processing method which enhances the input speech using Mel-warped Wiener filtering (MWF) at the pre-processing stage and compensates the recognition model using parallel model combination (PMC) at the post-processing stage, for speech recognition in noisy environments. The PMC uses the residual noise extracted from the silence region of the speech enhanced at the pre-processing stage to compensate the clean speech model, which is expected to improve recognition performance in noisy environments. For the recognition experiments, we down-sampled the KLE PBW (Phoneme Balanced Words) 452-word speech data to 8 kHz and created noisy speech at five SNR levels (0 dB, 5 dB, 10 dB, 15 dB, and 20 dB) by adding Subway, Car, and Exhibition noise to the clean speech. The recognition results confirm the effectiveness of the proposed MWF-PMC method, which achieved improved recognition performance overall compared with the existing combined methods.
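
The model-compensation step can be pictured with the standard log-add approximation used in PMC: clean-speech log-spectral means are mapped back to the linear spectral domain, combined additively with the noise means, and mapped forward again. The sketch below works directly on log filter-bank means for brevity; the gain factor and the omission of variance compensation (and of the Mel-warped Wiener pre-filter) are simplifying assumptions.

```python
import numpy as np

def pmc_log_add(clean_log_means, noise_log_means, gain=1.0):
    """Combine clean-speech and noise model means in the linear spectral
    domain (PMC log-add approximation, means only; variances omitted).

    clean_log_means, noise_log_means : arrays of log filter-bank energies
    gain : assumed level-matching factor between speech and noise models
    """
    lin_speech = np.exp(clean_log_means)
    lin_noise = np.exp(noise_log_means)
    # Speech and noise are assumed additive in the linear spectral domain.
    return np.log(gain * lin_speech + lin_noise)

# Example: a clean-model mean of 2.0 combined with a noise mean of 1.0
# yields log(exp(2.0) + exp(1.0)) ~= 2.313.
combined = pmc_log_add(np.array([2.0]), np.array([1.0]))
```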