[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.3837/tiis.2020.08.006

A Method of Evaluating Korean Articulation Quality for Rehabilitation of Articulation Disorder in Children

Lee, Keonsoo (Convergence Institute of Medical Information Communication Technology and Management, Soonchunhyang University)
Nam, Yunyoung (Department of Computer Science and Engineering, Soonchunhyang University)

Publication Information

KSII Transactions on Internet and Information Systems (TIIS), v.14, no.8, 2020, pp. 3257-3269

Abstract

Articulation disorders are characterized by an inability to pronounce clearly because of misuse of the articulators. This paper proposes a method of detecting such disorders by comparing a child's speech with standard pronunciations. The method defines the standard pronunciations from the speech of normal children by clustering it on three features: the Linear Predictive Cepstral Coefficient (LPCC), the Mel-Frequency Cepstral Coefficient (MFCC), and the Relative Spectral Analysis Perceptual Linear Prediction (RASTA-PLP). By calculating the distance between the centroid of the standard pronunciation and an input pronunciation, disordered speech whose features lie outside the cluster is detected. Eighty-nine children (58 normal children and 31 children with disorders) were recruited. Thirty-five U-TAP test words were selected; for each word, a standard pronunciation was built from the normal children's speech and compared with the pronunciations of the children with disorders. In the experiments, the disordered pronunciations were successfully distinguished from the standard pronunciations.
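
The centroid-distance idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors are synthetic stand-ins for LPCC, MFCC, or RASTA-PLP features, Euclidean distance is assumed, and the cluster boundary (mean plus k standard deviations of the normal speakers' distances) is an assumed rule that the abstract does not specify. The names `fit_standard`, `is_outside`, and `k` are hypothetical.

```python
import math
import random


def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_standard(normal_vectors, k=3.0):
    """Return (centroid, radius) describing one word's standard pronunciation.

    radius = mean + k * std of the normal speakers' distances to the centroid.
    This boundary rule is an assumption made for illustration only.
    """
    c = centroid(normal_vectors)
    dists = [euclidean(v, c) for v in normal_vectors]
    mean = sum(dists) / len(dists)
    std = math.sqrt(sum((d - mean) ** 2 for d in dists) / len(dists))
    return c, mean + k * std


def is_outside(vector, c, radius):
    """True when the input pronunciation lies outside the standard cluster."""
    return euclidean(vector, c) > radius


# Synthetic stand-in for 58 normal speakers, 12 features each (not real data).
rng = random.Random(0)
normal = [[rng.gauss(0.0, 1.0) for _ in range(12)] for _ in range(58)]
c, radius = fit_standard(normal)

typical = [0.0] * 12   # close to the centre of the synthetic cluster
shifted = [5.0] * 12   # far from the synthetic cluster

print(is_outside(typical, c, radius))
print(is_outside(shifted, c, radius))
```

In this sketch one (centroid, radius) pair would be fitted per test word, and each input pronunciation of that word would be checked against it. A larger `k` widens the boundary, so fewer inputs are flagged.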

Keywords

Articulation Disorder; LPCC; MFCC; RASTA-PLP; U-TAP
