Input Dimension Reduction based on Continuous Word Vector for Deep Neural Network Language Model

  • Received : 2015.08.25
  • Accepted : 2015.11.23
  • Published : 2015.12.31

Abstract

In this paper, we investigate an input dimension reduction method using continuous word vectors in a deep neural network language model. In the proposed method, continuous word vectors are generated with Google's Word2Vec from a large training corpus, following the distributional hypothesis, and the 1-of-$|V|$ coded discrete word vectors at the input are replaced with their corresponding continuous word vectors. In our implementation, the input dimension was successfully reduced from 20,000 to 600 when a tri-gram language model was used with a vocabulary of 20,000 words, and the total training time on the Wall Street Journal training corpus (37M words) was reduced from 30 days to 14 days.
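
The substitution described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the gensim 4.x Word2Vec API in place of Google's original word2vec tool, an illustrative 300-dimensional vector size so that the two history words of a tri-gram model yield the 600-dimensional input mentioned in the abstract, a toy corpus in place of the Wall Street Journal data, and a hypothetical helper name trigram_input.

    # Minimal sketch (assumes gensim 4.x; vector size and corpus are illustrative).
    import numpy as np
    from gensim.models import Word2Vec

    # Train continuous word vectors on a tokenized corpus; in the paper this
    # would be the 37M-word Wall Street Journal training corpus.
    sentences = [["the", "stock", "market", "rose"],
                 ["the", "stock", "market", "fell"]]
    model = Word2Vec(sentences, vector_size=300, window=5, min_count=1)

    def trigram_input(w1, w2):
        # A tri-gram language model conditions on the two history words.
        # Concatenating their 300-dim continuous vectors gives a 600-dim
        # input, replacing sparse 1-of-|V| coding over a 20,000-word
        # vocabulary.
        return np.concatenate([model.wv[w1], model.wv[w2]])

    x = trigram_input("stock", "market")
    print(x.shape)  # (600,)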

References

  1. Bengio, Y., Ducharme, R., Vincent, P. and Jauvin, C. (2003). A neural probabilistic language model, Journal of Machine Learning Research, Vol. 3, 1137-1155.
  2. Bengio, Y. (2009). Learning deep architectures for AI, Foundations and Trends in Machine Learning, Vol. 2, No. 1, 1-127. https://doi.org/10.1561/2200000006
  3. Schwenk, H. and Gauvain, J. (2005). Training neural network language models on very large corpora, in Proc. Empirical Methods in Natural Language Processing, 201-208.
  4. Arisoy, E., Sainath, T., Kingsbury, B. and Ramabhadran, B. (2012). Deep neural network language models, in Proc. NAACL-HLT 2012 Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT, 20-28.
  5. Turney, P. and Pantel, P. (2010). From frequency to meaning: vector space models of semantics, Journal of Artificial Intelligence Research, Vol. 37, No. 1, 141-188. https://doi.org/10.1613/jair.2934
  6. Schutze, H. and Pedersen, J. (1995). Information retrieval based on word sense, in Proc. Symposium on Document Analysis and Information Retrieval, 161-175.
  7. Rubenstein, H. and Goodenough, J. (1965). Contextual correlates of synonymy, Communications of the ACM, Vol. 8, No. 10, 627-633. https://doi.org/10.1145/365628.365657
  8. Bruni, E., Boleda, G., Baroni, M. and Tran, N. (2012). Distributional semantics in technicolor, in Proc. 50th Annual Meeting of the Association for Computational Linguistics, 136-145.
  9. Mikolov, T. (2013). Word2Vec, https://code.google.com/p/word2vec.
  10. Faruqui, M. and Dyer, C. (2014). Community evaluation and exchange of word vectors at wordvectors.org, in Proc. Association for Computational Linguistics, 1-6.
  11. Finkelstein, L., Gabrilovich, E., Matias, Y., Rivlin, E., Solan, Z., Wolfman, G. and Ruppin, E. (2001). Placing search in context: the concept revisited, in Proc. The Tenth International World Wide Web Conference, 406-414.
  12. Luong, M., Socher, R. and Manning, C. (2013). Better word representations with recursive neural networks for morphology, in Proc. Computational Natural Language Learning, 1-10.