http://dx.doi.org/10.3837/tiis.2020.11.002

PGA: An Efficient Adaptive Traffic Signal Timing Optimization Scheme Using Actor-Critic Reinforcement Learning Algorithm

Shen, Si (College of Computer Science and Technology, Zhejiang University of Technology)
Shen, Guojiang (College of Computer Science and Technology, Zhejiang University of Technology)
Shen, Yang (College of Computer Science and Technology, Zhejiang University of Technology)
Liu, Duanyang (College of Computer Science and Technology, Zhejiang University of Technology)
Yang, Xi (College of Computer Science and Technology, Zhejiang University of Technology)
Kong, Xiangjie (College of Computer Science and Technology, Zhejiang University of Technology)

Publication Information

KSII Transactions on Internet and Information Systems (TIIS) / v.14, no.11, 2020, pp. 4268-4289

Abstract

Advanced traffic signal timing methods play an important role in reducing road congestion and air pollution. Many recent studies regard reinforcement learning as a superior approach to building traffic light timing schemes. It achieves truly adaptive control by taking real-time traffic information as the state and adjustments to the traffic light scheme as the action. However, existing works are inefficient at complex intersections and lack feasibility, because most of them adopt traffic light schemes with a flexible phase sequence. To address these issues, a novel adaptive traffic signal timing scheme is proposed. It is based on an actor-critic reinforcement learning algorithm and integrates two advanced techniques, proximal policy optimization and generalized advantage estimation. In particular, a new reward function and a simplified state representation are carefully defined; they improve the learning efficiency and reduce the computational complexity, respectively. Meanwhile, a signal scheme with a fixed phase sequence is derived, and a constraint on the variation of successive phase durations is introduced, which enhances the scheme's feasibility and robustness in field applications. The proposed scheme is verified through field-data-based experiments in both medium and high traffic density scenarios. Simulation results show a remarkable improvement in both traffic performance and learning efficiency compared with existing reinforcement learning-based methods such as 3DQN and DDQN.
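
The abstract names three ingredients without giving their formulas: generalized advantage estimation (GAE), the clipped objective of proximal policy optimization (PPO), and a limit on how much successive phase durations may change. The NumPy sketch below illustrates the textbook forms of the first two and one plausible reading of the third. It is not the authors' implementation: the function names, the hyperparameters (`gamma`, `lam`, `eps`) and the bounds `max_delta`, `min_green` and `max_green` are all assumptions chosen for illustration, and the paper's reward function and state representation are not reproduced here.

```python
import numpy as np


def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimation over one finite rollout.

    `values` must hold len(rewards) + 1 entries; the last entry is the
    bootstrap value of the state reached after the final step.
    """
    rewards = np.asarray(rewards, dtype=float)
    values = np.asarray(values, dtype=float)
    # One-step temporal-difference residuals.
    deltas = rewards + gamma * values[1:] - values[:-1]
    advantages = np.zeros_like(rewards)
    running = 0.0
    # Accumulate the discounted sum of residuals backwards in time.
    for t in reversed(range(len(rewards))):
        running = deltas[t] + gamma * lam * running
        advantages[t] = running
    return advantages


def ppo_clip_objective(new_logp, old_logp, advantages, eps=0.2):
    """Clipped surrogate objective of PPO (to be maximized)."""
    ratio = np.exp(np.asarray(new_logp, dtype=float) - np.asarray(old_logp, dtype=float))
    advantages = np.asarray(advantages, dtype=float)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return float(np.mean(np.minimum(unclipped, clipped)))


def bounded_phase_durations(prev, proposed, max_delta=5.0,
                            min_green=10.0, max_green=60.0):
    """Hypothetical constraint: each phase duration (seconds) may change by
    at most `max_delta` per cycle and must stay within the green bounds.
    The phase order itself is never changed, only the durations."""
    prev = np.asarray(prev, dtype=float)
    proposed = np.asarray(proposed, dtype=float)
    step = np.clip(proposed - prev, -max_delta, max_delta)
    return np.clip(prev + step, min_green, max_green)
```

In a sketch like this, the actor would propose durations for the fixed sequence of phases, `bounded_phase_durations` would limit the change from the previous cycle, and the advantages from `gae_advantages` would weight the policy update through `ppo_clip_objective`.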

Keywords

Traffic signal timing; reinforcement learning; actor-critic; proximal policy optimization; generalized advantage estimation
