Platelet Inventory Management with Approximate Dynamic Programming
Hossein Abouee-Mehrizi, Mahdi Mirjalili, Vahid Sarhangian
TL;DR
This work tackles platelet inventory management under endogenous shelf-life uncertainty by casting the problem as an infinite-horizon discounted MDP with fixed ordering costs. It shows that shelf-life randomness makes the value function non-convex and invalidates the structural policy properties that hold under deterministic shelf-life, which motivates a simulation-based Approximate Dynamic Programming (ADP) approach that combines basis-function value approximation with simulation-based policy iteration. The authors develop a practical online ADP algorithm, benchmark it against an information-relaxation lower bound, and show through extensive numerical studies and a case study using real hospital data that the ADP policy substantially reduces expirations and shortages while remaining computationally tractable for larger instances. The results highlight the value of accounting for shelf-life uncertainty in ordering decisions: the ADP policy performs strongly against deterministic-shelf-life benchmarks and historical practice.
Abstract
We study a stochastic perishable inventory control problem with endogenous (decision-dependent) uncertainty in the shelf-life of units. Our primary motivation is determining ordering policies for blood platelets. Determining optimal ordering quantities is a challenging task due to the short maximum shelf-life of platelets (3-5 days after testing) and high uncertainty in daily demand. We formulate the problem as an infinite-horizon discounted Markov Decision Process (MDP). The model captures salient features observed in our data from a network of Canadian hospitals and allows for fixed ordering costs. We show that with uncertainty in shelf-life, the value function of the MDP is non-convex and key structural properties valid under deterministic shelf-life no longer hold. Hence, we propose an Approximate Dynamic Programming (ADP) algorithm to find approximate policies. We approximate the value function using a linear combination of basis functions and tune the parameters using a simulation-based policy iteration algorithm. We evaluate the performance of the proposed policy using extensive numerical experiments in parameter regimes relevant to the platelet inventory management problem. We further leverage the ADP algorithm to evaluate the impact of ignoring shelf-life uncertainty. Finally, we evaluate the out-of-sample performance of the ADP algorithm in a case study using real data and compare it to the historical hospital performance and other benchmarks. The ADP policy can be computed online in a few minutes and results in more than 50% lower expiry and shortage rates compared to the historical performance. In addition, it performs as well as or better than an exact policy that ignores uncertainty in shelf-life and that becomes hard to compute for larger instances of the problem.
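To make the value-function approximation step concrete, the following is a minimal sketch of fitting a linear combination of basis functions to simulated cost-to-go samples by least squares. It is an illustration under stated assumptions, not the authors' implementation: the state representation (on-hand units by remaining shelf-life), the particular features in `basis`, and the function names `fit_weights` and `approx_value` are all hypothetical, and the abstract does not specify which basis functions the paper uses.

```python
import numpy as np

def basis(state):
    """Hypothetical basis functions for an inventory state.

    `state` is assumed to be a vector of on-hand units indexed by
    remaining shelf-life. The features (constant, per-age counts,
    squared total) are illustrative choices only.
    """
    s = np.asarray(state, dtype=float)
    total = s.sum()
    return np.concatenate(([1.0], s, [total ** 2]))

def fit_weights(states, sampled_costs):
    """Least-squares fit of weights theta so that basis(s) @ theta
    approximates the sampled discounted cost-to-go from each state."""
    Phi = np.array([basis(s) for s in states])
    y = np.asarray(sampled_costs, dtype=float)
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return theta

def approx_value(state, theta):
    """Evaluate the linear value-function approximation at a state."""
    return float(basis(state) @ theta)
```

In a simulation-based policy iteration loop, `sampled_costs` would come from simulating the current policy from each sampled state, and the fitted weights would then define the next policy through a one-step lookahead; that outer loop is omitted here.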
