Spectral Shape Invariant Real-time Voice Change System

  • Weon-Goo Kim (School of Electronics and Information Engineering, Kunsan National University)
  • Published : 2005.02.01

Abstract

This paper proposes a spectral-shape-invariant, real-time voice change method that converts a speaker's voice into a mechanical-sounding voice. LPC analysis and synthesis are used so that the spectral shape of the voice is preserved while the pitch of the synthesized speech can be changed freely. A gain matching method is applied to the excitation signal generator to make the converted voice sound more natural. Voice change experiments were conducted to evaluate the proposed method. The results show that the original speech is converted into a mechanical voice from which the original speaker's identity is difficult to recognize, while the content of the speech is still conveyed correctly despite drastic changes in pitch. The system was implemented on a TI TMS320C6711 DSK board to verify that it runs in real time.
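The abstract gives no implementation details, so the following is only a minimal Python sketch of how an LPC analysis and synthesis chain with a gain-matched excitation could be arranged. It is not the author's implementation. The function names (`lpc`, `gain_matched_pulses`, `change_voice`), the predictor order of 10, the 160-sample frame, the unwindowed autocorrelation method, and the use of a plain impulse train as the new excitation are all illustrative assumptions.

```python
import math

def lpc(frame, order):
    """Autocorrelation-method LPC solved with Levinson-Durbin.
    Returns (a, err) where a[0] == 1 and A(z) = sum_k a[k] z^-k."""
    n = len(frame)
    r = [sum(frame[i] * frame[i + k] for i in range(n - k))
         for k in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    if err <= 0.0:                      # silent frame: leave the filter flat
        return a, 0.0
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
        if err <= 0.0:                  # numerically exhausted prediction error
            break
    return a, err

def gain_matched_pulses(residual, period, phase):
    """Impulse train with the given period (in samples), scaled so that its
    energy equals the energy of the LPC residual of the same frame.
    `phase` is the offset of the first pulse; the updated offset is returned."""
    n = len(residual)
    exc = [0.0] * n
    pos = phase
    while pos < n:
        exc[pos] = 1.0
        pos += period
    count = sum(1 for e in exc if e)
    if count:
        gain = math.sqrt(sum(e * e for e in residual) / count)
        exc = [gain * e for e in exc]
    return exc, pos - n

def change_voice(x, fs, new_pitch_hz, order=10, frame_len=160):
    """Keep the per-frame LPC envelope, replace the excitation with a
    fixed-pitch, gain-matched impulse train (assumed parameters)."""
    period = max(1, round(fs / new_pitch_hz))
    x_hist = [0.0] * order              # past input samples (analysis filter)
    y_hist = [0.0] * order              # past output samples (synthesis filter)
    phase, y = 0, []
    for start in range(0, len(x) - frame_len + 1, frame_len):
        frame = x[start:start + frame_len]
        a, _ = lpc(frame, order)
        residual = []
        for s in frame:                 # inverse filtering: e[n] = A(z) x[n]
            residual.append(
                s + sum(a[k] * x_hist[-k] for k in range(1, order + 1)))
            x_hist = x_hist[1:] + [s]
        exc, phase = gain_matched_pulses(residual, period, phase)
        for u in exc:                   # synthesis: y[n] = u[n] - sum a[k] y[n-k]
            s = u - sum(a[k] * y_hist[-k] for k in range(1, order + 1))
            y.append(s)
            y_hist = y_hist[1:] + [s]
    return y
```

Under these assumptions, `change_voice(samples, 8000, 60.0)` would resynthesize 8 kHz speech at a constant 60 Hz pitch while reusing each frame's LPC envelope. Matching the excitation energy to the residual energy keeps the loudness contour of the input, which is one plausible reading of the gain matching step described in the abstract.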
