Research Trends in Deep Learning Model Adaptation Techniques

Yang, Jun-Yeong; Jang, Jun-Hyeok

Information and Communications Magazine (정보와 통신)

Volume 33, Issue 9, Pages 3-7, 2016, pISSN 1226-4725

The Korean Institute of Communications and Information Sciences (한국통신학회)

Yang, Jun-Yeong (Hanyang University);
Jang, Jun-Hyeok (Hanyang University)

Published : 2016.08.31


Abstract

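Deep learning can model complex feature spaces by extracting and composing the features latent in large amounts of input data. When a model generalizes poorly to a particular data distribution encountered in the test environment, however, a technique is needed to adapt the model to that environment using data from it. This article surveys research trends through the various adaptation techniques used in acoustic modeling, the area where research on DNN model adaptation is most active.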

