
Minimize Order Picking Time through Relocation of Products in Warehouse Based on Reinforcement Learning  

Kim, Yeojin (Department of Industrial Engineering, Kumoh National Institute of Technology)
Kim, Geuntae (Department of Industrial Engineering, Kumoh National Institute of Technology)
Lee, Jonghwan (Department of Industrial Engineering, Kumoh National Institute of Technology)
Publication Information
Journal of the Semiconductor & Display Technology / v.21, no.2, 2022, pp. 90-94
Abstract
To minimize picking time, products should be located close to the exit at the time they are released from the warehouse. Currently, the warehouse determines storage locations based on each product's requirement rank, that is, its frequency of arrival and departure. Items with lower requirement ranks are stored far from the exit, and items with higher requirement ranks are stored closer to it. As a result, a product stored far from the exit because of its low requirement rank can have an earlier delivery time than products located near the exit. In that case, the transit time increases when the product is released. To solve this problem, we use the idle time of the stocker in the warehouse to rearrange the products in order of delivery time. Temporal difference learning with Q-learning control, a type of reinforcement learning, was used to relocate the items. The results of rearranging the products with the reinforcement learning method were compared with the results of the existing method.
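
The temporal difference update behind Q-learning control can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the one-dimensional aisle, the reward of -1 per move, the hyperparameters, and all names (`N_SLOTS`, `step`, `train`, `greedy_policy`) are hypothetical and do not come from the paper.

```python
import random

# Hypothetical toy setting (not from the paper): a single aisle of slots,
# where slot 0 is the one nearest the exit.
N_SLOTS = 6
ACTIONS = (-1, +1)                      # move one slot toward / away from the exit
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1   # illustrative hyperparameters


def step(state, action):
    """Move the carried product one slot; every move costs one time unit."""
    nxt = min(max(state + action, 0), N_SLOTS - 1)
    done = nxt == 0                     # episode ends at the slot next to the exit
    return nxt, -1.0, done


def train(episodes=2000, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_SLOTS)]
    for _ in range(episodes):
        state = rng.randrange(1, N_SLOTS)
        for _ in range(100):            # cap on episode length
            if rng.random() < EPSILON:
                a = rng.randrange(2)    # explore
            else:
                a = max(range(2), key=lambda i: q[state][i])  # exploit
            nxt, reward, done = step(state, ACTIONS[a])
            # TD target uses the best next action (off-policy Q-learning).
            target = reward if done else reward + GAMMA * max(q[nxt])
            q[state][a] += ALPHA * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Best learned move for each non-terminal slot 1..N_SLOTS-1."""
    return [ACTIONS[max(range(2), key=lambda i: q[s][i])]
            for s in range(1, N_SLOTS)]


q_table = train()
print(greedy_policy(q_table))
```

In this toy setting the learned greedy policy should move the product toward the exit from every slot, since each extra move adds a time penalty. A realistic relocation problem would need a richer state (for example, the slot assignment and delivery times of all stored products), which this sketch does not attempt to model.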
Keywords
Reinforcement Learning; Q-learning; TD (Temporal Difference learning); Relocation; Machine Learning