
Automatic severity classification of dysarthria using voice quality, prosody, and pronunciation features  

Yeo, Eun Jung (Department of Linguistics, Seoul National University)
Kim, Sunhee (Department of French Language Education, Seoul National University)
Chung, Minhwa (Department of Linguistics, Seoul National University)
Publication Information
Phonetics and Speech Sciences / v.13, no.2, 2021, pp. 57-66
This study addresses automatic severity classification of dysarthric speakers based on speech intelligibility. Speech intelligibility is a complex measure affected by features from multiple speech dimensions, yet most previous studies use features from only a single dimension. To capture the characteristics of the speech disorder more effectively, we extracted features from three speech dimensions: voice quality, prosody, and pronunciation. Voice quality features comprise jitter, shimmer, harmonics-to-noise ratio (HNR), number of voice breaks, and degree of voice breaks. Prosody features cover speech rate (total duration, speech duration, speaking rate, articulation rate), pitch (F0 mean, standard deviation, minimum, maximum, median, first quartile, third quartile), and rhythm (%V, deltas, Varcos, rPVIs, nPVIs). Pronunciation features comprise Percentage of Correct Phonemes (Percentage of Correct Consonants, Vowels, and Total phonemes) and degree of vowel distortion (Vowel Space Area, Formant Centralized Ratio, Vowel Articulatory Index, F2-Ratio). Experiments were conducted with various feature combinations. Using features from all three speech dimensions gives the best result, an F1-score of 80.15, compared with using features from only one or two dimensions. This result implies that voice quality, prosody, and pronunciation features should all be considered in automatic severity classification of dysarthria.
dysarthria; automatic severity classification; speech dimensions; machine learning; feature selection
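
As an illustration of the vowel distortion measures named in the abstract, the following is a minimal Python sketch of Vowel Space Area, Formant Centralized Ratio (often written as formant centralization ratio, FCR), Vowel Articulatory Index (VAI), and F2-Ratio. It assumes the definitions commonly used in the literature, computed from the first two formants of the corner vowels /a/, /i/, and /u/; the abstract does not give formulas, so the paper's exact formulation may differ. The `Formants` class, the `vowel_distortion` function, and the formant values in the usage lines are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Formants:
    """First and second formant frequencies (Hz) of one corner vowel."""
    f1: float
    f2: float


def vowel_distortion(a: Formants, i: Formants, u: Formants) -> dict:
    """Compute four vowel distortion measures from corner-vowel formants.

    Assumed (commonly used) definitions, not taken from the abstract:
      VSA      = area of the /a/-/i/-/u/ triangle in the F1-F2 plane
      FCR      = (F2u + F2a + F1i + F1u) / (F2i + F1a)
      VAI      = 1 / FCR
      F2-Ratio = F2i / F2u
    """
    # Triangle area via the shoelace formula on the points (F1, F2).
    vsa = 0.5 * abs(
        i.f1 * (a.f2 - u.f2)
        + a.f1 * (u.f2 - i.f2)
        + u.f1 * (i.f2 - a.f2)
    )
    # The numerator holds formants that rise with centralization,
    # the denominator holds formants that fall with it.
    fcr = (u.f2 + a.f2 + i.f1 + u.f1) / (i.f2 + a.f1)
    vai = 1.0 / fcr
    f2_ratio = i.f2 / u.f2
    return {"vsa": vsa, "fcr": fcr, "vai": vai, "f2_ratio": f2_ratio}


# Hypothetical formant values (Hz), for illustration only.
measures = vowel_distortion(
    a=Formants(f1=800.0, f2=1300.0),
    i=Formants(f1=300.0, f2=2300.0),
    u=Formants(f1=350.0, f2=800.0),
)
print(measures)
```

Under these definitions, a more centralized (distorted) vowel space yields a smaller area, a larger FCR, and a smaller VAI and F2-Ratio, which is why such measures are candidates for severity classification.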