http://dx.doi.org/10.3837/tiis.2021.03.003

Visual Analysis of Deep Q-network  

Seng, Dewen (School of Computer Science and Technology, Hangzhou Dianzi University)
Zhang, Jiaming (School of Computer Science and Technology, Hangzhou Dianzi University)
Shi, Xiaoying (School of Computer Science and Technology, Hangzhou Dianzi University)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS), vol. 15, no. 3, pp. 853-873, 2021
Abstract
In recent years, deep reinforcement learning (DRL) models have attracted great interest owing to their success in a variety of challenging tasks. Deep Q-Network (DQN) is a widely used deep reinforcement learning model that trains an intelligent agent to execute optimal actions while interacting with an environment. The model is well known for its ability to surpass skilled human players across many Atari 2600 games. Although DQN has achieved excellent performance in practice, a clear understanding of why the model works is still lacking. In this paper, we present a visual analytics system for understanding deep Q-networks in a non-blind manner. Based on the data stored during the training and testing process, four coordinated views are designed to expose the internal execution mechanism of DQN from different perspectives. We report the system performance and demonstrate its effectiveness through two case studies. Using our system, users can learn the relationship between states and Q-values, the function of the convolutional layers, the strategies learned by DQN, and the rationality of the decisions made by the agent.
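As background, the training the abstract refers to is the standard DQN update: the network Q(s, a; θ) is regressed toward the Bellman target r + γ·max_a' Q(s', a'; θ⁻), where θ⁻ are the parameters of a periodically synchronized target network (Mnih et al., 2015). The PyTorch sketch below is illustrative only, not the authors' implementation; the names QNet and dqn_loss and the small dense architecture are assumptions made for brevity (the original Atari DQN uses convolutional layers over stacked game frames).

import torch
import torch.nn as nn
import torch.nn.functional as F

class QNet(nn.Module):
    # Tiny Q-network: maps a state vector to one Q-value per action.
    # (Hypothetical stand-in for the convolutional network used on Atari frames.)
    def __init__(self, state_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, s):
        return self.net(s)

def dqn_loss(q_net, target_net, s, a, r, s_next, done, gamma=0.99):
    # Q(s, a) for the actions the agent actually took in the batch
    # (a is an int64 tensor of action indices, shape [batch]).
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    # Bellman target r + gamma * max_a' Q_target(s', a'), with no bootstrap
    # at terminal states (done is a 0/1 float tensor).
    with torch.no_grad():
        q_next = target_net(s_next).max(dim=1).values
        target = r + gamma * (1.0 - done) * q_next
    # Huber loss, as is common in DQN implementations.
    return F.smooth_l1_loss(q_sa, target)

At test time the agent acts greedily, executing argmax_a Q(s, a); the relationship between states and these Q-values is what the system's coordinated views are designed to expose.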
Keywords
Visual Analytics; Deep Reinforcement Learning; Deep Q-Networks