To minimize picking time when products are released from the warehouse, products should be stored close to the exit. Currently, the warehouse assigns storage locations according to each product's demand rank, that is, its frequency of arrival and departure: items with lower demand ranks are stored far from the exit, and items with higher demand ranks are stored close to it. However, a product stored far from the exit because of its low demand rank may still be scheduled for delivery earlier than the products located near the exit. In such cases, the transit time at release increases. To solve this problem, we use the idle time of the stocker in the warehouse to rearrange the products in order of their delivery times. For relocating the items, we used temporal difference learning with Q-learning control, one type of reinforcement learning. The results of rearranging the products with this reinforcement learning method were compared with and analyzed against the results of the existing method.
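The temporal difference update with Q-learning control mentioned above can be illustrated with a minimal sketch. The following is an assumption-laden toy, not the paper's actual formulation: the state is simply the slot index a product occupies (slot 0 nearest the exit), the two actions move it one slot toward or away from the exit, and the reward favors reaching the exit-side slot, standing in for shorter picking time.

```python
import random

# Toy sketch of tabular Q-learning (TD control) for a simplified
# relocation task. States, actions, and rewards are illustrative
# assumptions, not the paper's actual warehouse model.
N_SLOTS = 5                      # hypothetical number of storage slots
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

def step(state, action):
    """Action 0 moves one slot toward the exit, action 1 moves away."""
    nxt = max(0, state - 1) if action == 0 else min(N_SLOTS - 1, state + 1)
    # Reward favors the slot nearest the exit (shorter picking time).
    reward = 1.0 if nxt == 0 else -0.1
    done = nxt == 0
    return nxt, reward, done

def train(episodes=500, seed=0):
    random.seed(seed)
    q = {(s, a): 0.0 for s in range(N_SLOTS) for a in (0, 1)}
    for _ in range(episodes):
        state = random.randrange(1, N_SLOTS)
        done = False
        while not done:
            # Epsilon-greedy action selection.
            if random.random() < EPSILON:
                action = random.choice((0, 1))
            else:
                action = 0 if q[(state, 0)] >= q[(state, 1)] else 1
            nxt, reward, done = step(state, action)
            # Q-learning TD update: bootstrap with the max over next actions.
            target = reward + GAMMA * max(q[(nxt, 0)], q[(nxt, 1)])
            q[(state, action)] += ALPHA * (target - q[(state, action)])
            state = nxt
    return q

q = train()
```

After training, the greedy policy derived from `q` prefers moving items toward the exit from every interior slot, which is the behavior the relocation scheme aims for during stocker idle time.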