http://dx.doi.org/10.30693/SMJ.2022.11.11.17

Personalized Speech Classification Scheme for the Smart Speaker Accessibility Improvement of the Speech-Impaired People

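SeungKwon Lee (MindLogic, Machine Learning)
U-Jin Choe (Tech University of Korea, Department of Computer Engineering)
Gwangil Jeon (Tech University of Korea, Department of Computer Engineering)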
Publication Information
Smart Media Journal / v.11, no.11, 2022, pp. 17-24
Abstract
With the spread of smart speakers built on voice recognition and deep learning, not only non-disabled people but also people who are blind or physically disabled can easily control home appliances such as lights and TVs by voice through linked home network services, which has greatly improved their quality of life. Speech-impaired people, however, cannot use these services because articulation or speech disorders make their pronunciation inaccurate. In this paper, we propose a personalized speech classification technique that lets speech-impaired people use some of the functions provided by a smart speaker. The goal is to raise the recognition rate and accuracy for sentences spoken by speech-impaired people, even with a small amount of data and a short training time, so that they can actually use the services a smart speaker provides. To this end, data augmentation and the one-cycle learning rate optimization technique were applied while fine-tuning a ResNet18 model. In an experiment in which each of 30 smart speaker commands was recorded 10 times and the model was trained within 3 minutes, the speech classification recognition rate was about 95.2%.
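The abstract names a one-cycle learning rate policy but gives no hyperparameters. The following is a minimal sketch of such a schedule in plain Python, assuming a cosine-shaped warm-up to a peak rate followed by a cosine decay. The function names and every default value (`max_lr`, `warmup_frac`, `div_factor`, `final_div_factor`) are illustrative assumptions, not settings reported by the authors, and the batch size and epoch count in the usage lines are likewise hypothetical.

```python
import math


def cosine_interp(start, end, frac):
    """Cosine interpolation from `start` (frac=0) to `end` (frac=1)."""
    return end + (start - end) / 2.0 * (1.0 + math.cos(math.pi * frac))


def one_cycle_lr(step, total_steps, max_lr=0.01, warmup_frac=0.3,
                 div_factor=25.0, final_div_factor=1e4):
    """Learning rate at 0-based `step` under a one-cycle policy.

    All default values are assumptions chosen for illustration only.
    """
    if total_steps < 2:
        raise ValueError("total_steps must be at least 2")
    if not 0 <= step < total_steps:
        raise ValueError("step must satisfy 0 <= step < total_steps")
    if not 0.0 < warmup_frac <= 1.0:
        raise ValueError("warmup_frac must be in (0, 1]")

    initial_lr = max_lr / div_factor           # rate at the first step
    final_lr = initial_lr / final_div_factor   # rate at the last step
    # Index of the step at which the rate peaks (at least 1).
    peak_step = max(1, int(warmup_frac * (total_steps - 1)))

    if step <= peak_step:
        # Warm-up phase: rise from initial_lr to max_lr.
        return cosine_interp(initial_lr, max_lr, step / peak_step)
    # Annealing phase: fall from max_lr to final_lr.
    remaining = total_steps - 1 - peak_step
    return cosine_interp(max_lr, final_lr, (step - peak_step) / remaining)


if __name__ == "__main__":
    # 30 commands x 10 recordings = 300 samples, as in the abstract.
    # Batch size and epoch count below are hypothetical.
    samples, batch_size, epochs = 30 * 10, 30, 10
    total = (samples // batch_size) * epochs
    for s in (0, total // 4, total // 2, total - 1):
        print(s, one_cycle_lr(s, total))
```

The short high-rate phase in the middle of the schedule is what makes this policy attractive when the training budget is only a few minutes, as in the experiment described above.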
Keywords
smart speaker; speech-impaired people; disabled accessibility; personalized speech classification scheme; deep learning