A Study on the Recognition of English Pronunciation Based on Artificial Intelligence

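  • Cheol-Seung Lee (이철승; Department of AI Convergence, Kwangju Women's University) ;
  • Hye-Jin Baek (백혜진; Division of General Education, Kwangju Women's University)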
  • Received : 2021.04.14
  • Accepted : 2021.06.17
  • Published : 2021.06.30

Abstract

The fourth industrial revolution has recently drawn the attention of countries worldwide, led by the major advanced economies. Artificial intelligence, its core technology, is developing through convergence with many other fields and is strongly influencing edutech, where considerable effort is going into innovating education. This paper builds an experimental environment using the DTW (dynamic time warping) speech recognition algorithm and trains deep learning models on a variety of native and non-native speaker data. By comparing the results with a CNN algorithm, we measure the similarity of English pronunciation so that non-native speakers can correct their pronunciation toward that of native speakers.
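As a rough illustration of how DTW can yield a pronunciation similarity measure, the following is a minimal sketch and not the implementation used in this paper. It assumes each utterance has already been reduced to a one-dimensional feature sequence (for example a per-frame pitch or energy contour). The function names `dtw_distance` and `similarity`, the absolute-difference local cost, and the mapping from distance to a score in (0, 1] are all hypothetical choices made for the example.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two non-empty 1-D sequences.

    Assumption: the local cost is the absolute difference between frames.
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b frame is repeated
                cost[i][j - 1],      # b advances, a frame is repeated
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def similarity(a, b):
    """Hypothetical score in (0, 1]; 1.0 means the sequences align perfectly."""
    normalized = dtw_distance(a, b) / (len(a) + len(b))
    return 1.0 / (1.0 + normalized)


if __name__ == "__main__":
    native = [0.0, 1.0, 2.0, 1.0, 0.0]        # made-up reference contour
    learner = [0.0, 0.0, 1.0, 2.5, 1.0, 0.0]  # made-up learner contour
    print(dtw_distance(native, learner), similarity(native, learner))
```

Because DTW allows frames to be repeated, two utterances with the same contour spoken at different speeds score as highly similar, which is why it suits comparing a learner's pronunciation against a native reference of different duration.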

Acknowledgement

This research was supported by the Kwangju Women's University research fund for the 2021 academic year (KWUI21-027).
