Improving Dynamic Missile Defense Effectiveness Using Multi-Agent Deep Q-Network Model

  • Received : 2024.03.08
  • Accepted : 2024.06.03
  • Published : 2024.06.30

Abstract

The threat of North Korea's long-range firepower is recognized as a typical asymmetric threat, and South Korea is prioritizing the development of a Korean-style missile defense system to defend against it. Previous research modeled North Korean long-range artillery attacks as a Markov Decision Process (MDP) and used Approximate Dynamic Programming (ADP) as the missile defense algorithm; because of ADP's limitations, however, this paper applies deep reinforcement learning, which combines reinforcement learning with deep neural networks. Specifically, we develop a missile defense algorithm based on a modified, multi-agent Deep Q-Network (DQN). We then evaluate how effectively the resulting system responds to enemy missile attacks, taking the attack patterns of recent wars into account, so that an efficient missile defense system can be implemented, and we show that the policies learned through deep reinforcement learning yield superior results.
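The abstract above is the only technical description on this page; purely as orientation to the machinery it refers to, the sketch below shows the standard DQN building blocks (a Q-network, experience replay, epsilon-greedy action selection, and a target-network update) in PyTorch, following the general pattern of references 16 and 28. It is a minimal single-agent sketch standing in for the paper's modified multi-agent model: the sizes and names (STATE_DIM, N_ACTIONS, QNetwork, and so on) are illustrative placeholders, not the authors' state or action encoding.

```python
import random
from collections import deque

import torch
import torch.nn as nn
import torch.optim as optim

# Hypothetical problem sizes for illustration only; the paper's actual state
# and action encodings (threat tracks, interceptor inventory, etc.) differ.
STATE_DIM = 8    # features describing incoming threats and defended assets
N_ACTIONS = 4    # e.g., which interceptor to fire, or hold fire
GAMMA = 0.99     # discount factor of the underlying MDP


class QNetwork(nn.Module):
    """Small fully connected network approximating Q(s, a)."""

    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


policy_net = QNetwork(STATE_DIM, N_ACTIONS)
target_net = QNetwork(STATE_DIM, N_ACTIONS)
target_net.load_state_dict(policy_net.state_dict())
optimizer = optim.Adam(policy_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)  # replay buffer of (s, a, r, s', done) tuples


def select_action(state: torch.Tensor, epsilon: float) -> int:
    """Epsilon-greedy selection over the Q-values for one state."""
    if random.random() < epsilon:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(policy_net(state).argmax().item())


def train_step(batch_size: int = 32) -> None:
    """One DQN update: regress Q(s, a) toward r + GAMMA * max_a' Q_target(s', a')."""
    if len(replay) < batch_size:
        return
    batch = random.sample(list(replay), batch_size)
    s, a, r, s2, done = zip(*batch)
    s, s2 = torch.stack(s), torch.stack(s2)
    a = torch.tensor(a, dtype=torch.int64)
    r = torch.tensor(r, dtype=torch.float32)
    done = torch.tensor(done, dtype=torch.float32)
    # Q-values of the actions actually taken.
    q = policy_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Bootstrapped target from the frozen target network.
        target = r + GAMMA * target_net(s2).max(dim=1).values * (1.0 - done)
    loss = nn.functional.smooth_l1_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Periodically (e.g., every few hundred steps) sync the target network:
    # target_net.load_state_dict(policy_net.state_dict())
```

In a full training loop, each transition from the engagement simulation would be appended to `replay` and `train_step` called once per environment step; a multi-agent variant along the lines the abstract describes would maintain one such learner (or a shared network) per interceptor battery, which is beyond this sketch.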

Keywords

Acknowledgements

This study was partially supported by an industry-academic research program of Hannam University and Hanwha System.

References

  1. Bertsekas, D., Homer, M., Logan, D., Patek, S., and Sandell, N., Missile Defense and Interceptor Allocation by Neuro-Dynamic Programming, IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 2000, Vol. 30, No. 1, pp. 42-51. https://doi.org/10.1109/3468.823480
  2. Cha, Y.H. and Jeong, B., Exact Algorithm for the Weapon Target Assignment and Fire Scheduling Problem, Journal of the Society of Korea Industrial and Systems Engineering, 2019, Vol. 42, No. 1, pp. 143-150. https://doi.org/10.11627/jkise.2019.42.1.143
  3. Davis, M.T., Robbins, M.J., and Lunday, B.J., Approximate Dynamic Programming for Missile Defense Interceptor Fire Control, European Journal of Operational Research, 2017, Vol. 259, pp. 873-886. https://doi.org/10.1016/j.ejor.2016.11.023
  4. Im, J.S., Yoo, B.C., Kim, J.H., and Choi, B.W., A Study of Multi-to-Majority Response on Threat Assessment and Weapon Assignment Algorithm: by Adjusting Ballistic Missiles and Long-Range Artillery Threat, Journal of the Society of Korea Industrial and Systems Engineering, 2021, Vol. 44, No. 4, pp. 43-52. https://doi.org/10.11627/jksie.2021.44.4.043
  5. Jang, B.C. and Kwon, H.J., Consideration on Our Asymmetric Response through the Israel-Hamas Surprise Attack, Defense & Technology, 2023, Vol. 538, pp. 116-125.
  6. Jang, J.G., Kim, K., Choi, B.W., and Suh, J.J., A Linear Approximation Model for an Asset-based Weapon Target Assignment Problem, Journal of the Society of Korea Industrial and Systems Engineering, 2015, Vol. 38, No. 3, pp. 108-116. https://doi.org/10.11627/jkise.2015.38.3.108
  7. Jung, J.K., Uhm, H.S., and Lee, Y.H., Rolling-Horizon Scheduling Algorithm for Dynamic Weapon-Target Assignment in Air Defense Engagement, Journal of the Korean Institute of Industrial Engineers, 2020, Vol. 46, No. 1, pp. 11-24. https://doi.org/10.7232/JKIIE.2020.46.1.011
  8. Kim, H.H., Kim, J.H., Kong, J.H., and Gyeong, J.H., Reinforcement Learning-based Dynamic Weapon Allocation for Multiple Long-range Artillery Attacks, Journal of the Society of Korea Industrial and Systems Engineering, 2022, Vol. 45, No. 4, pp. 42-52. https://doi.org/10.11627/jksie.2022.45.4.042
  9. Kim, J.H., Kim, K., Choi, B.W., and Suh, J.J., An Application of Quantum-inspired Genetic Algorithm for Weapon Target Assignment Problem, Journal of the Society of Korea Industrial and Systems Engineering, 2017, Vol. 40, No. 4, pp. 260-267. https://doi.org/10.11627/jkise.2017.40.4.260
  10. Lee, C.S., Kim, J.H., Choi, B.W., and Kim, K.T., Approximate Dynamic Programming Based Interceptor Fire Control and Effectiveness Analysis for M-To-M Engagement, Journal of the Korean Society for Aeronautical & Space Sciences, 2022, Vol. 50, No. 4, pp. 287-295. https://doi.org/10.5139/JKSAS.2022.50.4.287
  11. Lee, W.W., Yang, H.R., Kim, G.W., Lee, Y.M., and Lee, E.R., Reinforcement Learning with Python and Keras, Revised Edition, 2020, pp. 227-247.
  12. Lee, Z.J., Lee, C.Y., and Su, S.F., An Immunity-Based Ant Colony Optimization Algorithm for Solving Weapon-Target Assignment Problem, Applied Soft Computing, 2002, Vol. 2, No. 1, pp. 39-47. https://doi.org/10.1016/S1568-4946(02)00027-3
  13. Li, S.E., Deep Reinforcement Learning, Reinforcement Learning for Sequential Decision and Optimal Control, Singapore: Springer Nature Singapore, 2023, pp. 365-402.
  14. Li, Y., Deep Reinforcement Learning: An Overview, arXiv preprint arXiv:1701.07274, 2017, pp. 5-28.
  15. Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M., Playing Atari with Deep Reinforcement Learning, arXiv preprint arXiv:1312.5602, 2013, pp. 2-5.
  16. Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., and Hassabis, D., Human-level Control Through Deep Reinforcement Learning, Nature, 2015, Vol. 518, No. 7540, pp. 529-533. https://doi.org/10.1038/nature14236
  17. Naeem, H. and Masood, A., An Optimal Dynamic Threat Evaluation and Weapon Scheduling Technique, Knowledge-Based Systems, 2010, Vol. 23, No. 4, pp. 337-342. https://doi.org/10.1016/j.knosys.2009.11.012
  18. Park, Y.W. and Jung, J.W., Formulation of a Defense Artificial Intelligence Development Plan, Korean Society for Defense Technology, Dec. 2020, pp. 3-8.
  19. Powell, W.B., Approximate Dynamic Programming: Solving the Curses of Dimensionality, Second Edition, 2011, John Wiley & Sons, Hoboken, NJ, pp. 315-346.
  20. Powell, W.B., Approximate Dynamic Programming: Solving the Curses of Dimensionality, Second Edition, 2011, John Wiley & Sons, Hoboken, NJ, pp. 235-276.
  21. Powell, W.B., Perspectives of Approximate Dynamic Programming, Annals of Operations Research, 2012, Vol. 13, No. 2, pp. 1-38. https://doi.org/10.1007/s10479-012-1077-6
  22. Schulman, J., Levine, S., Abbeel, P., Jordan, M., and Moritz, P., Trust Region Policy Optimization, In Proceedings of The 32nd International Conference on Machine Learning, 2015, pp. 1-9.
  23. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O., Proximal Policy Optimization Algorithms, arXiv preprint arXiv:1707.06347, 2017.
  24. Segye News, https://www.segye.com/newsView/20231102526999 (accessed 2023/2/7).
  25. Shin, M.K., Park, S.S., Lee, D., and Choi, H.L., Mean Field Game based Reinforcement Learning for Weapon-Target Assignment, Journal of the Korea Institute of Military Science and Technology, 2020, Vol. 23, No. 4, pp. 337-345. https://doi.org/10.9766/KIMST.2020.23.4.337
  26. Summers, D.S., Robbins, M.J., and Lunday, B.J., An Approximate Dynamic Programming Approach for Comparing Firing Policies in a Networked Air Defense Environment, Computers & Operations Research, 2020, Vol. 117, pp. 1-29. https://doi.org/10.1016/j.cor.2020.104890
  27. Sutton, R.S. and Barto, A.G., Reinforcement Learning: An Introduction, 2nd ed., MIT Press, 2018, pp. 30-39.
  28. PyTorch Tutorials, Reinforcement Learning (DQN) Tutorial, https://tutorials.pytorch.kr/intermediate/reinforcement_q_learning.html (accessed 2024/1/5).
  29. Yonhap News, https://www.yna.co.kr/view/AKR20220410019151504 (accessed 2023/2/7).
  30. Yonhap News, https://www.yna.co.kr/view/MYH20231012022600641 (accessed 2024/2/7).