http://dx.doi.org/10.5391/JKIIS.2007.17.7.957

Sustained Vowel Modeling using Nonlinear Autoregressive Method based on Least Squares-Support Vector Regression  

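Jang, Seung-Jin (Department of Biomedical Engineering, Yonsei University)
Kim, Hyo-Min (Department of Biomedical Engineering, Yonsei University)
Park, Young-Choel (Division of Computer and Telecommunications Engineering, Yonsei University)
Choi, Hong-Shik (Department of Otorhinolaryngology, Yonsei University)
Yoon, Young-Ro (Department of Biomedical Engineering, Yonsei University)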
Publication Information
Journal of the Korean Institute of Intelligent Systems, v.17, no.7, 2007, pp. 957-963
Abstract
In this paper, a Nonlinear Autoregressive (NAR) method based on Least Squares-Support Vector Regression (LS-SVR) is introduced and tested for nonlinear sustained vowel modeling. On a database of 43 sustained vowels associated with benign vocal fold lesions, all with aperiodic waveforms, this nonlinear synthesizer reproduced chaotic sustained vowels almost perfectly and preserved natural characteristics of the sound such as jitter, whereas Linear Predictive Coding does not preserve this naturalness. However, the results for some phonations differed considerably from the original sounds. This is assumed to be because the single-band model cannot control and decompose the high-frequency components. A multi-band model with a wavelet filterbank is therefore adopted in place of the single-band model, and it results in improved stability. In conclusion, nonlinear sustained vowel modeling using NAR based on LS-SVR can successfully synthesize sounds that closely resemble the original voiced sounds.
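As an illustration of the general technique named in the abstract, the sketch below fits a standard LS-SVR to delay-embedded samples and then runs the fitted model in free-running mode, feeding each prediction back as input. It is not the authors' implementation: the RBF kernel, the embedding order, the values of `gamma` and `sigma`, the synthetic two-tone test signal, and all function names are assumptions made for this example, since the abstract does not specify them.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between rows of A and rows of B."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def embed(x, order):
    """Delay embedding: row i is x[i:i+order], target is x[i+order]."""
    X = np.stack([x[i:i + order] for i in range(len(x) - order)])
    return X, x[order:]

def fit_lssvr(X, y, gamma, sigma):
    """Solve the LS-SVR linear system
        [0   1^T        ] [b    ]   [0]
        [1   K + I/gamma] [alpha] = [y]
    and return the bias b and the dual weights alpha."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]

def predict(Xq, X, b, alpha, sigma):
    """One-step NAR prediction f(x) = sum_i alpha_i k(x, x_i) + b."""
    return rbf_kernel(Xq, X, sigma) @ alpha + b

def synthesize(seed, n_samples, X, b, alpha, sigma):
    """Free-running synthesis: each output is fed back as the newest input."""
    state = list(seed)
    out = []
    for _ in range(n_samples):
        nxt = predict(np.array(state)[None, :], X, b, alpha, sigma)[0]
        out.append(nxt)
        state = state[1:] + [nxt]
    return np.array(out)

# Synthetic stand-in for a sustained vowel (assumption: not real voice data).
n = np.arange(200)
x = 0.6 * np.sin(2 * np.pi * 0.02 * n) + 0.3 * np.sin(2 * np.pi * 0.05 * n + 0.4)

order, gamma, sigma = 8, 100.0, 1.0          # hypothetical hyperparameters
X_train, y_train = embed(x, order)
b, alpha = fit_lssvr(X_train, y_train, gamma, sigma)

train_pred = predict(X_train, X_train, b, alpha, sigma)
synth = synthesize(x[-order:], 100, X_train, b, alpha, sigma)
```

Because the RBF kernel is bounded, the free-running output of this sketch stays bounded as well. That does not guarantee it tracks the original signal, which is the kind of stability problem the multi-band model in the abstract is meant to address.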
Keywords
Sustained Vowel Modeling; Least Squares-Support Vector Regression (LS-SVR); Nonlinear Autoregressive Model (NAR); Wavelet; Multi-band