Platelet Inventory Management with Approximate Dynamic Programming

Hossein Abouee-Mehrizi, Mahdi Mirjalili, Vahid Sarhangian

TL;DR

This work addresses platelet inventory management under endogenous shelf-life uncertainty by formulating the problem as an infinite-horizon discounted MDP with fixed ordering costs. It shows that shelf-life randomness makes the value function non-convex and invalidates the convexity-based structural properties that hold under deterministic shelf-life. This motivates a simulation-based Approximate Dynamic Programming (ADP) approach that approximates the value function with basis functions and tunes it by simulation-based policy iteration. The authors develop an ADP algorithm that can be run online, benchmark it against an information-relaxation lower bound, and show through extensive numerical studies and a case study using real HGH data that the ADP policy substantially reduces expirations and shortages while remaining computationally tractable for larger instances. The results highlight the value of accounting for shelf-life uncertainty in ordering decisions: the ADP policy performs well against both deterministic-shelf-life benchmarks and historical practice.

Abstract

We study a stochastic perishable inventory control problem with endogenous (decision-dependent) uncertainty in the shelf-life of units. Our primary motivation is determining ordering policies for blood platelets. Determining optimal ordering quantities is a challenging task due to the short maximum shelf-life of platelets (3-5 days after testing) and high uncertainty in daily demand. We formulate the problem as an infinite-horizon discounted Markov Decision Process (MDP). The model captures salient features observed in our data from a network of Canadian hospitals and allows for fixed ordering costs. We show that with uncertainty in shelf-life, the value function of the MDP is non-convex and key structural properties valid under deterministic shelf-life no longer hold. Hence, we propose an Approximate Dynamic Programming (ADP) algorithm to find approximate policies. We approximate the value function using a linear combination of basis functions and tune the parameters using a simulation-based policy iteration algorithm. We evaluate the performance of the proposed policy using extensive numerical experiments in parameter regimes relevant to the platelet inventory management problem. We further leverage the ADP algorithm to evaluate the impact of ignoring shelf-life uncertainty. Finally, we evaluate the out-of-sample performance of the ADP algorithm in a case study using real data and compare it to the historical hospital performance and other benchmarks. The ADP policy can be computed online in a few minutes and results in more than 50% lower expiry and shortage rates compared to the historical performance. In addition, it performs as well as or better than an exact policy that ignores uncertainty in shelf-life, and that exact policy becomes hard to compute for larger instances of the problem.
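To make the idea of a basis-function value approximation tuned by simulation-based policy iteration more concrete, the following is a minimal Python sketch on a toy problem. It is not the authors' algorithm: the two-dimensional state, the deterministic shelf-life of arriving units, the uniform demand, the discount factor, the quadratic basis, and every function name are assumptions made for illustration. The cost parameters reuse the values quoted in the Figure 1 caption.

```python
import random

# Cost values taken from the Figure 1 caption; GAMMA and MAX_Q are assumed.
H, SHORT, THETA, KAPPA, GAMMA = 1.0, 20.0, 5.0, 10.0, 0.9
MAX_Q = 12

def basis(x1, x2):
    """Hypothetical quadratic basis functions of the inventory state."""
    return [1.0, x1, x2, x1 * x1, x2 * x2, x1 * x2]

def v_hat(w, x1, x2):
    """Approximate value: a linear combination of the basis functions."""
    return sum(wi * fi for wi, fi in zip(w, basis(x1, x2)))

def step(x1, x2, q, d):
    """One day of a toy model: order q arrives, demand d is served oldest-first."""
    from_x1 = min(x1, d); d -= from_x1
    from_x2 = min(x2, d); d -= from_x2
    from_q = min(q, d); d -= from_q
    expired = x1 - from_x1               # unused units with one day left expire
    nx1, nx2 = x2 - from_x2, q - from_q  # remaining units age by one day
    cost = (KAPPA if q > 0 else 0.0) + SHORT * d + THETA * expired + H * (nx1 + nx2)
    return cost, (nx1, nx2)

def greedy_order(w, x1, x2, demands):
    """One-step lookahead order size, averaging over a demand sample."""
    def q_value(q):
        total = 0.0
        for d in demands:
            cost, (n1, n2) = step(x1, x2, q, d)
            total += cost + GAMMA * v_hat(w, n1, n2)
        return total / len(demands)
    return min(range(MAX_Q + 1), key=q_value)

def solve(a, b):
    """Gaussian elimination with partial pivoting for a small linear system."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit(features, targets, ridge=1e-3):
    """Ridge-regularised least squares via the normal equations."""
    n = len(features[0])
    a = [[sum(f[i] * f[j] for f in features) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(f[i] * t for f, t in zip(features, targets)) for i in range(n)]
    return solve(a, b)

def policy_iteration(rng, iterations=3, n_states=30, horizon=20, n_demand=15):
    """Alternate simulated policy evaluation and regression on the basis."""
    w = [0.0] * 6  # zero weights: the first policy is myopic
    demands = [rng.randint(0, 8) for _ in range(n_demand)]  # lookahead sample
    for _ in range(iterations):
        feats, targets = [], []
        for _ in range(n_states):
            x1, x2 = rng.randint(0, 8), rng.randint(0, 8)
            feats.append(basis(x1, x2))
            total, disc = 0.0, 1.0
            for _ in range(horizon):  # truncated discounted rollout
                q = greedy_order(w, x1, x2, demands)
                cost, (x1, x2) = step(x1, x2, q, rng.randint(0, 8))
                total += disc * cost
                disc *= GAMMA
            targets.append(total)
        w = fit(feats, targets)
    return w

if __name__ == "__main__":
    weights = policy_iteration(random.Random(0))
    print("fitted weights:", [round(v, 2) for v in weights])
    print("order at empty inventory:", greedy_order(weights, 0, 0, [2, 4, 6]))
```

Starting from zero weights means the first policy only looks at the immediate cost, and each later pass refits the weights to discounted costs simulated under the current greedy policy. The rollouts are truncated at a finite horizon, so the fitted values are biased estimates of the infinite-horizon cost.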

Paper Structure (33 sections, 1 theorem, 35 equations, 17 figures, 7 tables, 1 algorithm)


Key Result

Proposition EC.1

Assume $m=3$, uncertainty in shelf-life is exogenous, demand is uniformly distributed, and consider the class of policies that are linear in the state variables $x_1,x_2$. Denote the value function in iteration $n$ of the value iteration algorithm by $v^n(\tau,\textbf{x})$ with $v^0(\tau, \textbf{x})$ ...

Figures (17)

  • Figure 1: Comparing the expected cost function, value function, and optimal policy obtained under the fixed ordering cost of $\kappa=10$ and endogenous shelf-life uncertainty (left column) with those obtained under zero fixed ordering cost and endogenous (middle column) or deterministic shelf-life (right column) assuming $h=1, \, l =20, \, \theta = 5$ and the demand and other parameters are the same across columns.
  • Figure 2: Effect of order size coefficients in the multinomial logistic model on the remaining shelf-life distribution for order sizes of 0, 5, 10, and 15. A larger absolute magnitude of positive (negative) coefficients decreases (increases) the probability of receiving units with a remaining shelf-life of one day more rapidly.
  • Figure 3: Performance of candidate basis functions with respect to their MAPE calculated using the optimal value function.
  • Figure 4: Relative optimality gap of the ADP policy. The black line shows the estimate of the expected optimality gap in each iteration and the gray area is the corresponding 95% confidence interval. Averaged over all cases, the estimated expected optimality gap is 1.8%, an 80% reduction from the 8.9% gap of the initial Myopic solution at iteration zero.
  • Figure 5: Performance improvement for cases with optimality gaps above 5% in Figure 4. The interaction term $x_1x_2$ can further improve the results compared to the cubic terms $x_1^3$ and $x_2^3$.
  • ...and 12 more figures
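
The Figure 2 caption describes a multinomial logistic model linking order size to the remaining shelf-life distribution. A minimal Python sketch of such a model is given below. The choice of one day as the reference category, the function name, and the intercept and coefficient values are assumptions for illustration, not the fitted values from the paper.

```python
import math

def shelf_life_probs(q, intercepts, slopes):
    """P(remaining shelf-life = k | order size q) for k = 1..m.

    Category k = 1 is assumed to be the reference (logit fixed at 0);
    intercepts[i] and slopes[i] belong to category k = i + 2.
    """
    logits = [0.0] + [a + b * q for a, b in zip(intercepts, slopes)]
    top = max(logits)  # subtract the maximum for numerical stability
    expo = [math.exp(z - top) for z in logits]
    total = sum(expo)
    return [e / total for e in expo]

if __name__ == "__main__":
    # Hypothetical positive order-size coefficients for 2 and 3 days remaining.
    for q in (0, 5, 10, 15):
        probs = shelf_life_probs(q, intercepts=[0.0, 0.0], slopes=[0.1, 0.2])
        print(q, [round(p, 3) for p in probs])
```

Under this parameterisation, positive order-size coefficients shift probability away from the one-day category as the order grows, and larger coefficients do so faster, which matches the behaviour described in the caption.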

Theorems & Definitions (1)

  • Proposition EC.1