http://dx.doi.org/10.7471/ikeee.2021.25.4.728

On the Reward Function of Latent SAC Reinforcement Learning to Improve Longitudinal Driving Performance  

Jo, Sung-Bean (Dept. of Electrical Engineering, Pusan National University)
Jeong, Han-You (Dept. of Electrical Engineering, Pusan National University)
Publication Information
Journal of IKEEE / v.25, no.4, 2021, pp. 728-734
Abstract
In recent years, there has been strong interest in end-to-end autonomous driving based on deep reinforcement learning. In this paper, we present a reward function for latent SAC deep reinforcement learning that improves the longitudinal driving performance of an agent vehicle. While the existing reward function significantly degrades driving safety and efficiency, the proposed reward function is shown to maintain an appropriate headway distance while avoiding collisions with the front vehicle.
Keywords
Reinforcement learning; soft actor-critic; end-to-end learning; reward function; autonomous driving