Fig. 1. Visual QA: Standard (A) vs. Relational Reasoning (B)
Fig. 2. Visual QA Requiring Relational Reasoning
Fig. 3. Visual QA Architecture with RN (Santoro et al., 2017, Figure 2)
Fig. 4. Text-based QA Architecture with RN
Fig. 5. Our RN-based Visual QA Model Architecture
Fig. 6. Neural Network with Batch Normalization
Fig. 7. Relational Question (A) and Non-relational Question (B) Generated on Our Model
Fig. 8. Our RN-based Model on Visual QA Task
Fig. 9. Accuracy of Each Model with Different Hyperparameters
Fig. 10. Loss of Each Model with Different Hyperparameters
Fig. 11. Accuracy of Each Model with Different Learning Rates Set by Random Search
Fig. 12. Loss of Each Model with Different Learning Rates Set by Random Search
Table 1. Performance Comparison Between RN and Baseline on Our Visual QA Task
Table 2. Performance Improvement from Hyperparameter Tuning
Table 3. Comparison of Results on bAbI QA Task Using Different Learning Rates Set by Random Search
Table 4. Comparison of Results on Dialog-based LL QA Task Using Different Learning Rates Set by Random Search
References
- Adam Santoro, David Raposo, David G.T. Barrett, Mateusz Malinowski, Razvan Pascanu, Peter Battaglia, and Timothy Lillicrap, "A simple neural network module for relational reasoning," arXiv:1706.01427v1, 2017.
- David Raposo, Adam Santoro, David Barrett, Razvan Pascanu, Timothy Lillicrap, and Peter Battaglia, "Discovering objects and their relations from entangled scene representations," arXiv:1702.05068, 2017.
- Nicholas Watters, Andrea Tacchetti, Theophane Weber, Razvan Pascanu, Peter Battaglia, and Daniel Zoran, "Visual Interaction Networks," arXiv:1706.01433v1, 2017.
- Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, and Devi Parikh, "VQA: Visual question answering," arXiv:1505.00468v7, 2015.
- Antoine Bordes, Jason Weston, Sumit Chopra, and Tomas Mikolov, "Towards AI-complete question answering: A set of prerequisite toy tasks," arXiv:1502.05698, 2015.
- Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, and Rob Fergus, "End-To-End Memory Networks," arXiv:1503.08895v5, 2015.
- Sergey Ioffe and Christian Szegedy, "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift," arXiv:1502.03167, 2015.
- N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," Journal of Machine Learning Research, Vol.15, pp.1929-1958, 2014.
- Diederik Kingma and Jimmy Ba, "Adam: A Method for Stochastic Optimization," arXiv:1412.6980, 2014.
- James Bergstra and Yoshua Bengio, "Random Search for Hyper-Parameter Optimization," Journal of Machine Learning Research, Vol.13, pp.281-305, 2012.
- Jason Weston, "Dialog-based Language Learning," arXiv:1604.06045, 2016.
- Sebastian Ruder, "An overview of gradient descent optimization algorithms," arXiv:1609.04747v2, 2016.
- Ilya Sutskever, James Martens, George Dahl, and Geoffrey Hinton, "On the importance of initialization and momentum in deep learning," Proceedings of the 30th International Conference on Machine Learning, pp.1139-1147, 2013.
- John Duchi, Elad Hazan, and Yoram Singer, "Adaptive Subgradient Methods for Online Learning and Stochastic Optimization," Journal of Machine Learning Research, Vol.12, pp.2121-2159, 2011.
- Jasper Snoek, Hugo Larochelle, and Ryan P. Adams, "Practical Bayesian optimization of machine learning algorithms," arXiv:1206.2944v2, 2012.
- Matthias Feurer, Benjamin Letham, and Eytan Bakshy, "Scalable Meta-Learning for Bayesian Optimization," arXiv:1802.02219, 2018.
- Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Li Fei-Fei, C. Lawrence Zitnick, and Ross Girshick, "CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning," arXiv:1612.06890v1, 2017.