References
- Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, "Gradient-based learning applied to document recognition", Proceedings of the IEEE, vol. 86, issue 11, pp. 2278-2324, November 1998. https://doi.org/10.1109/5.726791
- S. Lawrence, C. L. Giles, A. C. Tsoi, A. D. Back, "Face recognition: a convolutional neural-network approach", IEEE Transactions on Neural Networks, vol. 8, issue 1, pp. 98-113, January 1997. https://doi.org/10.1109/72.554195
- http://www.image-net.org/challenges/LSVRC/2012/
- A. Krizhevsky, I. Sutskever, G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks", Advances in Neural Information Processing Systems, 25, pp. 1097-1105, 2012.
- H. Lee, R. Grosse, R. Ranganath, A. Y. Ng, "Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations", Proceedings of the 26th Annual International Conference on Machine Learning, pp. 609-616, 2009.
- http://ufldl.stanford.edu/tutorial/supervised/ConvolutionalNeuralNetwork/
- R. Socher, A. Perelygin, J. Y. Wu, J. Chuang, C. D. Manning, A. Y. Ng, C. Potts, "Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank", Conference on Empirical Methods in Natural Language Processing, pp. 1631-1642, 2013.
- O. Irsoy, C. Cardie, "Deep Recursive Neural Networks for Compositionality in Language", Advances in Neural Information Processing Systems, 27, pp. 2096-2104, 2014.
- D. M. Blei and M. Jordan, "Modeling Annotated Data," Proceedings of the 26th annual ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2003), pp. 127-134, 2003.
- N. Srivastava and R. Salakhutdinov, "Multimodal Learning with Deep Boltzmann Machines," Advances in Neural Information Processing Systems 2012 (NIPS 2012), pp. 2222-2230, 2012.
- O. Vinyals, A. Toshev, S. Bengio, and D. Erhan, "Show and Tell: A Neural Image Caption Generator," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), pp. 3156-3164, 2015.
- K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. Zemel, and Y. Bengio, "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention," Proceedings of the 32nd International Conference on Machine Learning, 2015.
- N. Srivastava, E. Mansimov, and R. Salakhutdinov, "Unsupervised Learning of Video Representations using LSTMs," Proceedings of the 32nd International Conference on Machine Learning, 2015.
- L. Yao, A. Torabi, K. Cho, N. Ballas, C. Pal, H. Larochelle, and A. Courville, "Video description generation incorporating spatio-temporal features and a soft-attention mechanism," arXiv preprint arXiv:1502.08029, 2015.
- S. Venugopalan, M. Rohrbach, J. Donahue, R. Mooney, T. Darrell, and K. Saenko, "Sequence to Sequence - Video to Text," arXiv preprint arXiv:1505.00487, 2015.
- J.-W. Ha, K.-M. Kim, and B.-T. Zhang, "Automated Construction of Visual-Linguistic Knowledge via Concept Learning from Cartoon Videos," Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI 2015), pp. 522-528, 2015.