Vocal Enhancement for Improving the Performance of Vocal Pitch Detection

  • Received : 2011.04.29
  • Accepted : 2011.07.29
  • Published : 2011.08.31

Abstract

This paper proposes a vocal enhancement technique, applied as a preprocessing step, for improving the performance of vocal pitch detection in polyphonic music signals. The proposed technique predicts the accompaniment signal from the input polyphonic signal and scales the prediction according to the vocal power of the input to generate an accompaniment replica signal. It then removes the accompaniment replica from the input signal in the frequency domain, yielding a vocal-enhanced signal. To evaluate the method, the same vocal pitch detection method was applied to the original signals and to the vocal-enhanced signals; the proposed technique increased pitch detection accuracy by 7.1 percentage points on average.
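
As a rough illustration of the three steps in the abstract (predict the accompaniment, shape it into a replica according to the vocal power, subtract it in the frequency domain), the following Python sketch operates on magnitude spectrograms. The abstract does not state how the accompaniment is predicted or how the replica is scaled, so everything specific here is an assumption: the function name `enhance_vocal`, the input `acc_mag` (an accompaniment estimate assumed to come from some external predictor), the per-frame gain rule, and the spectral floor `floor` are hypothetical choices, not the authors' method.

```python
import numpy as np


def enhance_vocal(mix_mag, acc_mag, floor=0.05, eps=1e-12):
    """Subtract a scaled accompaniment replica from a mixture spectrogram.

    mix_mag : (frames, bins) magnitude spectrogram of the polyphonic input.
    acc_mag : (frames, bins) predicted accompaniment magnitudes
              (ASSUMPTION: produced by an external predictor, not shown here).
    floor   : fraction of the input magnitude kept as a lower bound
              (ASSUMPTION: avoids negative magnitudes after subtraction).
    """
    mix_mag = np.asarray(mix_mag, dtype=float)
    acc_mag = np.asarray(acc_mag, dtype=float)

    # Per-frame power of the mixture and of the predicted accompaniment.
    mix_pow = np.sum(mix_mag ** 2, axis=1, keepdims=True)
    acc_pow = np.sum(acc_mag ** 2, axis=1, keepdims=True)

    # ASSUMPTION: vocal power is whatever mixture power the
    # accompaniment prediction does not explain.
    vocal_pow = np.maximum(mix_pow - acc_pow, 0.0)

    # ASSUMPTION: hypothetical gain rule in [0, 1] that scales the
    # replica by the vocal share of each frame's power.
    gain = np.sqrt(vocal_pow / (mix_pow + eps))
    replica = gain * acc_mag

    # Frequency-domain removal with a spectral floor.
    return np.maximum(mix_mag - replica, floor * mix_mag)
```

The enhanced magnitudes would then be passed to the same pitch detector used on the original signal, so that any change in accuracy can be attributed to the preprocessing alone.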
