[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.3745/KTCCS.2022.11.9.281

Collision Avoidance Path Control of Multi-AGV Using Multi-Agent Reinforcement Learning

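Choi, Ho-Bin (Korea University of Technology and Education, Department of Computer Engineering, Future Convergence Engineering Major)
Kim, Ju-Bong (Korea University of Technology and Education, Department of Computer Engineering, Future Convergence Engineering Major)
Han, Youn-Hee (Korea University of Technology and Education, Department of Computer Engineering, Future Convergence Engineering Major)
Oh, Se-Won (Electronics and Telecommunications Research Institute)
Kim, Kwi-Hoon (Korea National University of Education, AI Convergence Education Major)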

Publication Information

KIPS Transactions on Computer and Communication Systems / v.11, no.9, 2022, pp. 281-288

Abstract

Automated guided vehicles (AGVs) are widely used in industry to transport heavy materials around large buildings such as factories and warehouses. They are especially useful for automation in fulfillment centers. Increasing productivity in such warehouses requires sophisticated AGV path planning. We propose a scheme that can be applied to QMIX, a popular cooperative multi-agent reinforcement learning (MARL) algorithm. We measure its performance with three metrics across several fulfillment center layouts and compare the results with those of the original QMIX. We also visualize the transport paths of the trained AGVs as heat maps to analyze their behavior patterns.

Keywords

Fulfillment Center; Warehouse; AGV; Path Control; MARL

Citations & Related Records

Times Cited By KSCI: 1

Reference

1	Z. Han, D. Wang, F. Liu, and Z. Zhao, "Multi-AGV path planning with double-path constraints by using an improved genetic algorithm," PLoS ONE, Vol.12, No.7, 2017.
2	Y. Yang, J. Li, and L. Peng, "Multi-robot path planning based on a deep reinforcement learning DQN algorithm," CAAI Transactions on Intelligence Technology, Vol.5, No.3, pp.177-183, 2020.
3	L. Busoniu, R. Babuska, and B. Schutter, "Multi-agent reinforcement learning: An overview," Innovations in Multi-agent Systems and Applications-1, pp.183-221, 2010.
4	J. Cui, Y. Liu, and A. Nallanathan, "Multi-agent reinforcement learning-based resource allocation for UAV networks," IEEE Transactions on Wireless Communications, Vol.19, No.2, pp.729-743, 2019.
5	X. Li, X. Hu, W. Li, and H. Hu, "A multi-agent reinforcement learning routing protocol for underwater optical sensor networks," In Proceedings of IEEE International Conference on Communications, 2019.
6	F. A. Oliehoek, M. T. J. Spaan, and N. Vlassis, "Optimal and approximate Q-value functions for decentralized POMDPs," Journal of Artificial Intelligence Research, Vol.32, pp.289-353, 2008.
7	J. J. Enright and P. R. Wurman, "Optimization and coordinated autonomy in mobile fulfillment systems," In Proceedings of the AAAI Workshop on Automated Action Planning for Autonomous Mobile Robots, pp.33-38, 2011.
8	J. Bae and W. Chung, "A heuristic for a heterogeneous automated guided vehicle routing problem," International Journal of Precision Engineering and Manufacturing, Vol.18, No.6, pp.795-801, 2017.
9	Y. Lian and W. Xie, "Improved A* multi-AGV path planning algorithm based on grid-shaped network," In 2019 Chinese Control Conference, 2019.
10	R. Kamoshida and Y. Kazama, "Acquisition of automated guided vehicle route planning policy using deep reinforcement learning," IEEE International Conference on Advanced Logistics and Transport (ICALT), 2017.
11	J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, "Empirical evaluation of gated recurrent neural networks on sequence modeling," In NIPS 2014 Workshop on Deep Learning, 2014.
12	D. Ha, A. Dai, and Q. V. Le, "Hypernetworks," In Proceedings of the International Conference on Learning Representations (ICLR), 2017.
13	C. J. C. H. Watkins and P. Dayan, "Q-learning," Machine Learning, Vol.8, pp.279-292, 1992.
14	V. Mnih et al., "Human-level control through deep reinforcement learning," Nature, Vol.518, No.7540, pp.529-533, 2015.
15	M. Tan, "Multi-agent reinforcement learning: Independent vs. cooperative agents," In Proceedings of the Tenth International Conference on Machine Learning, pp.330-337, 1993.
16	P. Sunehag et al., "Value-decomposition networks for co-operative multi-agent learning based on team reward," In Proceedings of 17th International Conference on Autonomous Agents and Multiagent Systems, Stockholm, Sweden, 2018.
17	T. Rashid, M. Samvelyan, C. S. de Witt, G. Farquhar, J. Foerster, and S. Whiteson, "QMIX: Monotonic value function factorisation for deep multi-agent reinforcement learning," In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 2018.
18	O. Vinyals et al., "Starcraft II: A new challenge for reinforcement learning," arXiv preprint arXiv:1708.04782, 2017.
19	D. Ye et al., "Mastering complex control in moba games with deep reinforcement learning," In Proceedings of the AAAI Conference on Artificial Intelligence, pp.6672-6679, 2020.
20	S. Huang and S. Ontanon, "A closer look at invalid action masking in policy gradient algorithms," arXiv preprint arXiv:2006.14171, 2020.
21	X. Li, J. Zhang, J. Bian, Y. Tong, and T. Liu, "A cooperative multi-agent reinforcement learning framework for resource balancing in complex logistics network," In Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019.
22	F. A. Oliehoek and C. Amato, "A concise introduction to decentralized POMDPs," SpringerBriefs in Intelligent Systems, Springer, 2016.
