Development of articulatory estimation model using deep neural network
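You, Heejo (Department of Psychology, Korea University)
Yang, Hyungwon (Department of English Language and Literature, Korea University)
Kang, Jaekoo (Department of English Language and Literature, Korea University)
Cho, Youngsun (Department of English Language and Literature, Korea University)
Hwang, Sung Hah (Department of English Language and Literature, Korea University)
Hong, Yeonjung (Department of English Language and Literature, Korea University)
Cho, Yejin (Department of English Language and Literature, Korea University)
Kim, Seohyun (Department of English Language and Literature, Korea University)
Nam, Hosung (Korea University)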
1 | Ghosh, P. K. & Narayanan, S. (2011). Automatic speech recognition using articulatory features from subject-independent acoustic-to-articulatory inversion. The Journal of the Acoustical Society of America, 130(4), EL251-EL257.
2 | Sondhi, M. M. & Resnick, J. R. (1983). The inverse problem for the vocal tract: Numerical methods, acoustical experiments, and speech synthesis. The Journal of the Acoustical Society of America, 73(3), 985-1002.
3 | Wilson, I., Gick, B., O'Brien, M. G., Shea, C., & Archibald, J. (2006). Ultrasound technology and second language acquisition research. Proceedings of the 8th Generative Approaches to Second Language Acquisition Conference (GASLA 2006) (pp. 148-152). |
4 | Wrench, A. A., Gibbon, F., McNeill, A. M., & Wood, S. (2002). An EPG therapy protocol for remediation and assessment of articulation disorders. ICSLP. |
5 | Dusan, S. (2001). Methods for integrating phonetic and phonological knowledge in speech inversion. Proceedings of the International Conference on Speech, Signal and Image Processing. Malta. |
6 | Engwall, O. (2006). Evaluation of speech inversion using an articulatory classifier. Proceedings of the 7th International Seminar on Speech Production (pp. 469-476). |
7 | Papcun, G., Hochberg, J., Thomas, T. R., Laroche, F., Zacks, J., & Levy, S. (1992). Inferring articulation and recognizing gestures from acoustics with a neural network trained on x-ray microbeam data. The Journal of the Acoustical Society of America, 92(2), 688-700.
8 | Zacks, J. & Thomas, T. R. (1994). A new neural network for articulatory speech recognition and its application to vowel identification. Computer Speech & Language, 8(3), 189-209.
9 | Richmond, K. (2001). Mixture density networks, human articulatory data and acoustic-to-articulatory inversion of continuous speech. Proceedings of Workshop on Innovation in Speech Processing (WISP 2001) (pp. 259-276). |
10 | Qin, C. & Carreira-Perpinan, M. A. (2010). Articulatory inversion of American English /r/ by conditional density modes. Proceedings of 11th Annual Conference of the International Speech Communication Association (Interspeech 2010) (pp. 1998-2001).
11 | Richmond, K., Hoole, P., & King, S. (2011). Announcing the Electromagnetic Articulography (Day 1) Subset of the mngu0 Articulatory Corpus. Proceedings of 12th Annual Conference of the International Speech Communication Association (Interspeech 2011) (pp. 1505-1508). |
12 | Mitra, V., Nam, H., Espy-Wilson, C., Saltzman, E., & Goldstein, L. (2011). Articulatory information for noise robust speech recognition. IEEE Transactions on Audio, Speech, and Language Processing, 19(7), 1913-1924.
13 | Najnin, S. & Banerjee, B. (2015). Improved speech inversion using general regression neural network. The Journal of the Acoustical Society of America, 138(3), EL229-EL235.
14 | Tu, J. V. (1996). Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes. Journal of Clinical Epidemiology, 49(11), 1225-1231.
15 | Hinton, G. E., Osindero, S., & Teh, Y. W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18(7), 1527-1554.
16 | Simpson, A. J. (2015). Taming the ReLU with Parallel Dither in a Deep Neural Network (arXiv preprint). Retrieved from http://arxiv.org/abs/1509.05173 on September 17, 2015 |