
Real-Time Scheduling Scheme based on Reinforcement Learning Considering Minimizing Setup Cost

  • Yoo, Woosik (Dept. of Industrial and Management Engineering, Incheon National University) ;
  • Kim, Sungjae (Dept. of Industrial and Management Engineering, Incheon National University) ;
  • Kim, Kwanho (Dept. of Industrial and Management Engineering, Incheon National University)
  • Received : 2019.10.29
  • Accepted : 2020.03.24
  • Published : 2020.05.31

Abstract

This study starts from the observation that building a Gantt chart for schedule planning resembles a game of Tetris in which only the straight (I-shaped) piece exists. In this Tetris-like game, the X axis represents M machines and the Y axis represents time. It is assumed that every order type can be processed on every machine without splitting, but whenever consecutive orders on a machine differ in type, a setup cost is incurred immediately. We named this game Gantris and implemented it as a game environment. Schedules generated in real time by an agent trained with deep reinforcement learning were then compared against schedules produced in real time by human players of the game. Two learning environments were examined: a single-order-list environment and a random-order-list environment. Two systems were compared: a system with four machines and two order types (4M2T) and a system with ten machines and six order types (10M6T). The performance of a generated schedule was measured as a weighted sum of the setup cost, makespan, and idle time incurred while processing 100 orders. In the 4M2T system, the trained agent produced schedules with a better performance index than the human experimenters regardless of the learning environment. In the 10M6T system, the agent outperformed the experimenters in the single-order-list environment but underperformed them in the random-order-list environment. In terms of the number of job changes, however, the trained agent required fewer changes than the human experimenters in both the 4M2T and 10M6T systems, demonstrating strong scheduling performance with respect to setup changes.
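The Gantris state and its weighted-sum performance index can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class and method names, the unit processing time, the unit setup cost, and the weight values are all assumptions, since the abstract does not specify them.

```python
# Minimal sketch of a Gantris-style scheduling state (illustrative only).
# Assumptions: unit processing times, unit setup cost, and equal weights;
# the paper does not give these values.

class Gantris:
    def __init__(self, n_machines):
        # Per-machine finish time and the type of the last order processed.
        self.finish = [0] * n_machines
        self.last_type = [None] * n_machines
        self.setup_cost = 0

    def assign(self, machine, order_type, duration=1, setup=1):
        """Place one order on a machine; a type change incurs a setup cost."""
        if self.last_type[machine] is not None and self.last_type[machine] != order_type:
            self.setup_cost += setup
        self.finish[machine] += duration
        self.last_type[machine] = order_type

    def score(self, w_setup=1.0, w_makespan=1.0, w_idle=1.0):
        """Weighted sum of setup cost, makespan, and idle time (lower is better)."""
        makespan = max(self.finish)
        # Machines that finish early sit idle until the overall makespan.
        idle = sum(makespan - f for f in self.finish)
        return w_setup * self.setup_cost + w_makespan * makespan + w_idle * idle


# Example on a 4-machine system: two orders of different types on machine 0
# trigger one setup; machines 2 and 3 stay idle for the whole makespan.
g = Gantris(4)
g.assign(0, "A")
g.assign(0, "B")  # type change A -> B: setup cost +1
g.assign(1, "A")
print(g.setup_cost, g.score())
```

A reinforcement-learning agent in this framing would choose, at each step, which machine receives the next order in the list, with the weighted score (or its negative) shaping the reward.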

