http://dx.doi.org/10.7746/jkros.2017.12.3.297

Reinforcement Learning Strategy for Automatic Control of Real-time Obstacle Avoidance based on Vehicle Dynamics  

Kang, Dong-Hoon (Automotive Convergence Engineering, Korea University)
Bong, Jae Hwan (Mechanical Engineering, Korea University)
Park, Jooyoung (Department of Control and Instrumentation Engineering, Korea University)
Park, Shinsuk (Mechanical Engineering, Korea University)
Publication Information
The Journal of Korea Robotics Society, vol. 12, no. 3, 2017, pp. 297-305
Abstract
As the development of autonomous vehicles becomes realistic, many automobile manufacturers and component suppliers are aiming at 'completely autonomous driving'. ADAS (Advanced Driver Assistance Systems), which have recently been applied to production automobiles, support the driver with lane keeping, speed, and steering within a single lane under limited road conditions. Although obstacle avoidance technologies have been developed, they concentrate on simple avoidance maneuvers and do not consider control of the actual vehicle in real situations, where sudden changes in steering and speed make drivers feel unsafe. To develop a 'completely autonomous' automobile that perceives its surroundings and operates by itself, the vehicle's driving ability should be enhanced in the way a human driver drives. In this sense, this paper establishes a strategy with which autonomous vehicles behave in a human-friendly manner, based on vehicle dynamics, through reinforcement learning built on Q-learning, a type of machine learning. The obstacle avoidance reinforcement learning proceeded in five simulations. The reward rule was set so that the car learns by itself through recurring events, giving the experiment an environment similar to that of human driving. A driving simulator was used to verify the results of the reinforcement learning. The ultimate goal of this study is to enable autonomous vehicles to avoid obstacles in a human-friendly way when obstacles appear in their sight, using control methods learned beforehand under various conditions through reinforcement learning.
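The learning strategy rests on tabular Q-learning, as the abstract notes. The sketch below is a minimal illustration of that update rule only: the action set, the epsilon-greedy policy, and all hyperparameter and reward values are hypothetical placeholders, not the paper's actual state/action design or reward rule.

```python
import random
from collections import defaultdict

# Hypothetical settings: learning rate, discount factor, exploration rate.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
# Placeholder discrete action set for an avoidance maneuver.
ACTIONS = ["keep_lane", "steer_left", "steer_right", "brake"]

# Q[(state, action)] -> estimated value; unseen pairs default to 0.0.
Q = defaultdict(float)

def choose_action(state):
    """Epsilon-greedy policy: explore with probability EPSILON,
    otherwise pick the action with the highest current Q-value."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """One Q-learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```

In the paper's setting, the state would presumably encode the obstacle's relative position and vehicle-dynamics quantities, and the reward rule would penalize collisions and abrupt steering or speed changes; those specifics are described only at a high level in the abstract.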
Keywords
Reinforcement Learning; Obstacle Avoidance