Reinforcement Learning for the Perishable Inventory Control Problem
Taner Bilgiç, Boğaziçi University, Dept of Industrial Engineering
Managing perishable inventory effectively is critical in diverse sectors such as pharmaceuticals, composite materials, agriculture, blood, and grocery. The challenge lies in reducing costs while handling items with limited shelf lives and changing demand and/or costs. In this study, we explore the potential of reinforcement learning (RL) methods to address this problem. Resorting to RL is unnecessary when the control problem is stationary and there is no model uncertainty. But as soon as one of those "taming assumptions" is relaxed, obtaining structural results becomes much more difficult. Solving the problem computationally also quickly runs into the curse of dimensionality. We start with the well-established recursive formulation of the perishable inventory control problem and use dynamic programming algorithms (Q-value iteration) to solve small, tractable instances to optimality. Then we apply two RL algorithms, Q-learning and SARSA, to these instances, assuming the state transition probabilities are not known. This comparison gives us a sense of how RL performs on problems for which we know the optimal solution. Then we introduce non-stationary demand scenarios and increase the shelf life of products. The latter extension grows the state space exponentially, which makes the optimal solution intractable. RL algorithms produce sensible policies in both cases with very little extra computational effort. We also discuss more intricate versions of the problem where RL seems promising and describe avenues for further research.
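As a rough illustration of the two RL algorithms named in the abstract, the following is a minimal sketch and not the speakers' implementation. It assumes a hypothetical toy model: a shelf life of two periods, zero lead time, FIFO issuing, and made-up cost and demand parameters. With a two-period shelf life the state reduces to the number of units that have one period of life left. All names (step, train, greedy_policy) and all numbers are illustrative assumptions.

```python
import random

# Hypothetical toy parameters (assumptions, not taken from the talk).
MAX_ORDER = 4                               # largest order quantity allowed
DEMANDS = [0, 1, 2, 3, 4]                   # possible demand values
DEMAND_PROBS = [0.1, 0.2, 0.4, 0.2, 0.1]    # their probabilities
C_ORDER, C_HOLD, C_SHORT, C_OUTDATE = 1.0, 0.5, 5.0, 2.0


def step(x, q, d):
    """One period: x old units on hand, q fresh units ordered, demand d.

    Demand is served oldest-first (FIFO). Unused old units perish,
    unused fresh units become next period's old units.
    Returns (next_state, period_cost).
    """
    from_old = min(x, d)
    outdated = x - from_old
    remaining = d - from_old
    from_new = min(q, remaining)
    shortage = remaining - from_new
    next_x = q - from_new
    cost = (C_ORDER * q + C_HOLD * next_x
            + C_SHORT * shortage + C_OUTDATE * outdated)
    return next_x, cost


def train(sarsa=False, steps=20000, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """Tabular Q-learning (default) or SARSA on the toy problem.

    Q[x][q] estimates the discounted cost of ordering q in state x,
    so the greedy action minimizes Q. Demand is only sampled, never
    read as a transition model.
    """
    rng = random.Random(seed)
    n = MAX_ORDER + 1
    Q = [[0.0] * n for _ in range(n)]

    def choose(x):
        # Epsilon-greedy over costs: explore with probability eps.
        if rng.random() < eps:
            return rng.randrange(n)
        return min(range(n), key=lambda a: Q[x][a])

    x = 0
    q = choose(x)
    for _ in range(steps):
        d = rng.choices(DEMANDS, weights=DEMAND_PROBS)[0]
        nx, cost = step(x, q, d)
        nq = choose(nx)
        if sarsa:
            target = cost + gamma * Q[nx][nq]    # on-policy: action actually taken
        else:
            target = cost + gamma * min(Q[nx])   # off-policy: best next action
        Q[x][q] += alpha * (target - Q[x][q])
        x, q = nx, nq
    return Q


def greedy_policy(Q):
    """Order quantity with the lowest estimated cost in each state."""
    return [min(range(len(row)), key=lambda a: row[a]) for row in Q]
```

Calling greedy_policy(train()) and greedy_policy(train(sarsa=True)) gives one order quantity per state for each algorithm. On an instance this small, the two policies could be checked against a dynamic programming solution, which is the kind of comparison the abstract describes.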
Joint work with Ahmet Sualp Say.
Taner Bilgiç received BSc and MSc degrees from the Industrial Engineering Department of METU in 1987 and 1990, respectively. He received his Ph.D. from the University of Toronto in 1995. After spending two years as a post-doctoral research fellow at the same university, he joined Boğaziçi University as a faculty member in 1997. He is currently a professor in Boğaziçi University’s Industrial Engineering Department. His research interests include supply chain management, platform economics, and graphical models in causal reasoning and decision theory.
Friday, October 20, 2023, 4.00 pm - IE03 Halim Doğrusöz Auditorium