http://dx.doi.org/10.13064/KSSS.2020.12.4.091

Classification of muscle tension dysphonia (MTD) female speech and normal speech using cepstrum variables and random forest algorithm  

Yun, Joowon (Department of Speech & Language Pathology, Chungnam National University)
Shim, Heejeong (Division of Speech Pathology & Audiology, Hallym University)
Seong, Cheoljae (Department of Speech & Language Pathology, Chungnam National University)
Publication Information
Phonetics and Speech Sciences / v.12, no.4, 2020, pp. 91-98
Abstract
This study investigated the acoustic characteristics of the sustained vowel /a/ and of sentence utterances produced by patients with muscle tension dysphonia (MTD), using cepstrum-based acoustic variables. Thirty-six women diagnosed with MTD and the same number of women with normal voices participated, and the data were recorded and measured with ADSV™. Of all the variables, cepstral peak prominence (CPP) and CPP_F0 were statistically significantly lower in the MTD group than in the control group. On the GRBAS scale, overall severity (G) was the most prominent feature of the MTD patients' voice quality, followed in order by roughness (R), breathiness (B), and strain (S). These perceptual ratings showed a statistically significant negative correlation with CPP: as the ratings increased, CPP decreased. We then attempted to classify the MTD and control groups using the CPP and CPP_F0 variables. Statistical modeling with the Random Forest machine learning algorithm yielded much higher classification accuracy in the sentence reading task (100% on the training data and 83.3% on the test data) than in the sustained vowel task, and CPP proved to be the more important variable in both the vowel and sentence reading tasks.
Keywords
muscle tension dysphonia (MTD); cepstral peak prominence (CPP); CPP_F0; sentence reading task; Random Forest; machine learning; cepstral spectral index of dysphonia (CSID)
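The classification step described in the abstract (two predictors, CPP and CPP_F0, and a two-class Random Forest) can be illustrated with a minimal sketch. This is not the authors' implementation or data: the feature values are synthetic placeholders drawn from arbitrary Gaussian distributions, the 75/25 train/test split is an assumption, and the forest is a small pure-Python simplification (bootstrap samples, one randomly chosen feature per split, depth-limited trees). All function names are hypothetical.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def grow_tree(rows, labels, rng, depth=0, max_depth=4):
    """Grow one tree; each split considers a random subset of the features."""
    majority = int(sum(labels) * 2 >= len(labels))
    if depth == max_depth or gini(labels) == 0.0:
        return majority
    n_features = len(rows[0])
    k = max(1, int(n_features ** 0.5))  # with two features, k == 1
    split = best_split(rows, labels, rng.sample(range(n_features), k))
    if split is None:
        return majority
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            grow_tree([r for r, _ in left], [y for _, y in left], rng, depth + 1, max_depth),
            grow_tree([r for r, _ in right], [y for _, y in right], rng, depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root to a leaf; leaves are plain 0/1 integers."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, labels, n_trees=50, seed=0):
    """Fit each tree on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(grow_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees (ties go to class 1)."""
    votes = sum(predict_tree(tree, row) for tree in forest)
    return int(votes * 2 >= len(forest))

def make_synthetic(rng, n_per_group=36):
    """Placeholder (CPP, CPP_F0) pairs; the means and spreads are invented."""
    rows, labels = [], []
    for label, cpp_mu, f0_mu in ((0, 10.0, 200.0), (1, 7.0, 185.0)):  # 0 = control, 1 = MTD
        for _ in range(n_per_group):
            rows.append((rng.gauss(cpp_mu, 1.5), rng.gauss(f0_mu, 20.0)))
            labels.append(label)
    return rows, labels

rng = random.Random(42)
rows, labels = make_synthetic(rng)
order = list(range(len(rows)))
rng.shuffle(order)
cut = int(0.75 * len(order))  # assumed split, not taken from the paper
train, test = order[:cut], order[cut:]
forest = fit_forest([rows[i] for i in train], [labels[i] for i in train])
accuracy = sum(predict_forest(forest, rows[i]) == labels[i] for i in test) / len(test)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

Because the data are synthetic, the printed accuracy says nothing about the figures reported in the abstract; the sketch only shows the shape of the workflow. In practice a library implementation such as scikit-learn's `RandomForestClassifier`, which also reports feature importances, would be the more usual choice.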