Designing an Efficient Reward Function for Robot Reinforcement Learning of the Water Bottle Flipping Task

  • Received : 2018.12.07
  • Accepted : 2019.01.15
  • Published : 2019.05.31

Abstract

Robots are used at various industrial sites, but traditional methods of operating a robot are poorly suited to certain kinds of tasks. For a robot to accomplish such a task, an accurate model of the interaction between the robot and its environment must be derived and solved, which is complicated work. Reinforcement learning for robots is therefore actively studied as a way to overcome this difficulty. This study describes the process and results of applying reinforcement learning to a robot task. The task the robot learns is bottle flipping: throwing a plastic bottle so that it lands upright on its bottom. The complex motion of the liquid inside the bottle while it is in the air makes this task difficult to solve by traditional methods, whereas reinforcement learning handles it more readily. After a 3-DOF robotic arm is shown how to throw the bottle, the robot searches for an improved motion that completes the task successfully. Two reward functions are designed and their learning results are compared; the finite difference method is used to obtain the policy gradient. This paper focuses on the process of designing an efficient reward function to improve the bottle flipping motion.
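The abstract states that the policy gradient is obtained by the finite difference method. As a rough illustration of that general technique, not the paper's actual implementation, the sketch below estimates the gradient of an episodic reward with respect to policy parameters by randomly perturbing them and solving a least-squares system. All names and values here (`rollout_reward`, `delta`, `n_perturb`, the toy reward) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def finite_difference_policy_gradient(theta, rollout_reward, delta=0.05, n_perturb=20):
    """Estimate the policy gradient by finite differences.

    theta          : current policy parameters (e.g., knots of a throwing trajectory)
    rollout_reward : maps a parameter vector to the episodic reward of one rollout
    delta          : magnitude of random parameter perturbations
    n_perturb      : number of perturbed rollouts per gradient estimate
    """
    # Random perturbations of the parameter vector.
    d_theta = np.random.uniform(-delta, delta, size=(n_perturb, theta.size))
    r_ref = rollout_reward(theta)
    # Reward change observed for each perturbation.
    d_r = np.array([rollout_reward(theta + dt) - r_ref for dt in d_theta])
    # Least-squares fit of d_theta @ grad ~= d_r gives the gradient estimate.
    grad, *_ = np.linalg.lstsq(d_theta, d_r, rcond=None)
    return grad

if __name__ == "__main__":
    # Toy usage: gradient ascent on a quadratic reward with a known optimum.
    target = np.array([0.5, -0.2, 1.0])            # pretend optimal throw parameters
    reward = lambda th: -np.sum((th - target) ** 2)
    theta = np.random.default_rng(0).normal(size=3)
    for _ in range(200):
        theta += 0.1 * finite_difference_policy_gradient(theta, reward)
    print(theta)  # converges toward `target`
```

In the paper's setting, `theta` would parameterize the arm's throwing motion and `rollout_reward` would be one of the two designed reward functions, evaluated on an actual throw.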

