• Title/Summary/Keyword: Vocal enhancement

Search Result 5, Processing Time 0.02 seconds

Vocal Enhancement for Improving the Performance of Vocal Pitch Detection (보컬 피치 검출의 성능 향상을 위한 보컬 강화 기술)

  • Lee, Se-Won;Song, Chai-Jong;Lee, Seok-Pil;Park, Ho-Chong
    • The Journal of the Acoustical Society of Korea
    • /
    • v.30 no.6
    • /
    • pp.353-359
    • /
    • 2011
  • This paper proposes a vocal enhancement technique for improving the performance of vocal pitch detection in polyphonic music signal. The proposed vocal enhancement technique predicts an accompaniment signal from the input signal and generates an accompaniment replica signal according to the vocal power. Then, it removes the accompaniment replica signal from the input signal, resulting in a vocal-enhanced signal. The performance of the proposed method was measured by applying the same vocal pitch extraction method to the original and the vocal-enhanced signal, and the vocal pitch detection accuracy was increased by 7.1 % point in average.

Vocal separation method using weighted β-order minimum mean square error estimation based on kernel back-fitting (커널 백피팅 알고리즘 기반의 가중 β-지수승 최소평균제곱오차 추정방식을 적용한 보컬음 분리 기법)

  • Cho, Hye-Seung;Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea
    • /
    • v.35 no.1
    • /
    • pp.49-54
    • /
    • 2016
  • In this paper, we propose a vocal separation method using weighted ${\beta}$-order minimum mean wquare error estimation (WbE) based on kernel back-fitting algorithm. In spoken speech enhancement, it is well-known that the WbE outperforms the existing Bayesian estimators such as the minimum mean square error (MMSE) of the short-time spectral amplitude (STSA) and the MMSE of the logarithm of the STSA (LSA), in terms of both objective and subjective measures. In the proposed method, WbE is applied to a basic iterative kernel back-fitting algorithm for improving the vocal separation performance from monaural music signal. The experimental results show that the proposed method achieves better separation performance than other existing methods.

Intelligibility Improvement Benefit of Clear Speech and Korean Stops

  • Kang, Kyoung-Ho
    • Phonetics and Speech Sciences
    • /
    • v.2 no.1
    • /
    • pp.3-11
    • /
    • 2010
  • The present study confirmed the intelligibility improvement benefit of clear speech by investigating the intelligibility of Korean stops produced in different speaking styles: conversational, citation-form, and clear speech. This finding supports the Hypo- & Hyper-speech theory that speakers adjust vocal effort to accommodate hearers' speech perception difficulty. A progressive intelligibility improvement was found for the three speaking styles investigated: clear speech was more intelligible than citation-form speech citation-form speech was more intelligible than conversational speech and clear speech was also more intelligible than conversational speech. These findings suggest that the manipulations to elicit three distinct speaking styles in a laboratory setting were successful. Korean lenis stops showed the least intelligibility improvement among the three Korean stop types, and this result suggests that lenis stops should be more resistant to intelligibility enhancement efforts in clear speech than aspirated and fortis stops.

  • PDF

A Study on A Multi-Pulse Linear Predictive Filtering And Likelihood Ratio Test with Adaptive Threshold (멀티 펄스에 의한 선형 예측 필터링과 적응 임계값을 갖는 LRT의 연구)

  • Lee, Ki-Yong;Lee, Joo-Hun;Song, Iick-Ho;Ann, Sou-Guil
    • The Journal of the Acoustical Society of Korea
    • /
    • v.10 no.1
    • /
    • pp.20-29
    • /
    • 1991
  • A fundamental assumption in conventional linear predictive coding (LPC) analysis procedure is that the input to an all-pole vocal tract filter is white process. In the case of periodic inputs, however, a pitch bias error is introduced into the conventional LP coefficient. Multi-pulse (MP) LP analysis can reduce this bias, provided that an estimate of the excitation is available. Since the prediction error of conventional LP analysis can be modeled as the sum of an MP excitation sequence and a random noise sequence, we can view extracting MP sequences from the prediction error as a classical detection and estimation problem. In this paper, we propose an algorithm in which the locations and amplitudes of the MP sequences are first obtained by applying a likelihood ratio test (LRT) to the prediction error, and LP coefficients free of pitch bias are then obtained from the MP sequences. To verify the performance enhancement, we iterate the above procedure with adaptive threshold at each step.

  • PDF

Speech Dereverberation using Improved Linear Prediction Residual (개선된 선형예측 잔여를 이용한 음성의 잔향음 제거)

  • Park, Chan-Sub;Kim, Ki-Man;Kang, Suk-Youb
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.11 no.10
    • /
    • pp.1845-1851
    • /
    • 2007
  • Background noise and room reverberation are two causes of degradation in speech in listening situations. Many algorithms developed to enhance reverberant speech. In this paper we propose a dereverberation method for enhancement of speech using modified the linear prediction(LP) residual in reverberant room condition. The proposed dereberberation method based on the fact that the signification excitation of the vocal tract system takes place at the instant of glottal closure in voiced speech. Our method used delay information form each sensor, and we need reverberant signals from 3 sensors. We obtain a new LP residual signal using modified IP residual combination which derived form weighting of the LP residual and the Hilbert transform of LP residual. The nature of the coherently added Hilbert envelop has several large amplitude spikes because of the effects of noise and reverberation. This residual of the clean speech is used to excite the time-varying all-pole filter to obtain the enhanced speech. We achieved simulation of proposed algorithm for performance analysis in reverberation environment. The proposed algorithm improves substantially the quality of reverberant speech.