http://dx.doi.org/10.56977/jicce.2022.20.3.181

Autonomous and Asynchronous Triggered Agent Exploratory Path-planning Via a Terrain Clutter-index using Reinforcement Learning  

Kim, Min-Suk (Department of Human Intelligence and Robot Engineering, Sangmyung University)
Kim, Hwankuk (Department of Information Security Engineering, Sangmyung University)
Abstract
An intelligent distributed multi-agent system (IDMS) using reinforcement learning (RL) poses a challenging and intricate problem in which one or more agents aim to achieve their specific goals (sub-goals and a final goal) by moving through the states of a complex, cluttered environment. The IDMS environment provides a cumulative optimal reward for each action according to the policy of the learning process. Because most actions involve interacting with the given IDMS environment, the environment supplies the following elements: a starting agent state, multiple obstacles, agent goals, and a clutter index. The reward is also reflected by the RL-based agents, which can move randomly or intelligently to reach their respective goals and thereby improve learning performance. We extend several cases of intelligent multi-agent systems from our previous works: (a) a proposed environment-clutter-based index for agent sub-goal selection, with an analysis of its effect, and (b) a newly proposed RL reward scheme based on the environment clutter index, used to identify and analyze the prerequisites and conditions for improving the overall system.
Keywords
Intelligent Distributed Multi-Agent System (IDMS); Reinforcement Learning (RL); Sub-Goal; Environment-Clutter-Index;