Non-Stationary Inventory Control with Lead Times

Nele H. Amiri, Sean R. Sinclair, Maximiliano Udenio

TL;DR

This work analyzes how demand non-stationarity affects learning performance in inventory models with demand backlogging or lost sales, both with and without lead times. For each setting, it proposes an adaptive online algorithm that optimizes over the class of base-stock policies and establishes dynamic-regret guarantees.

Abstract

We study non-stationary single-item, periodic-review inventory control problems in which the demand distribution is unknown and may change over time. We analyze how demand non-stationarity affects learning performance across inventory models, including systems with demand backlogging or lost-sales, both with and without lead times. For each setting, we propose an adaptive online algorithm that optimizes over the class of base-stock policies and establish performance guarantees in terms of dynamic regret relative to the optimal base-stock policy at each time step. Our results reveal a sharp separation across inventory models. In backlogging systems and lost-sales models with zero lead time, we show that it is possible to adapt to demand changes without incurring additional performance loss in stationary environments, even without prior knowledge of the demand distributions or the number of demand shifts. In contrast, for lost-sales systems with positive lead times, we establish weaker guarantees that reflect fundamental limitations imposed by delayed replenishment in combination with censored feedback. Our algorithms leverage the convexity and one-sided feedback structure of inventory costs to enable counterfactual policy evaluation despite demand censoring. We complement the theoretical analysis with simulation results showing that our methods significantly outperform existing benchmarks.
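The one-sided feedback mentioned in the abstract can be illustrated with a minimal sketch, assuming a lost-sales system with zero lead time and a per-period cost of `h` per leftover unit and `b` per unit of lost demand (these cost symbols, and all function names below, are assumptions for illustration and are not taken from the paper's algorithms). If the played base-stock level is `played` and only sales `min(demand, played)` are observed, then for any lower level the quantity `h * leftover - b * sold` can still be computed from sales alone, and it differs from the true cost only by `b * demand`, a term that does not depend on the level. Comparisons between levels at or below the played one are therefore unaffected by censoring.

```python
import random


def true_cost(level, demand, h, b):
    """Overage plus underage cost; needs the uncensored demand."""
    return h * max(level - demand, 0.0) + b * max(demand - level, 0.0)


def pseudo_cost(level, sales, h, b):
    """Cost of `level` up to the additive term b * demand.

    Only meaningful when `level` is at most the level actually played,
    because then min(demand, level) == min(sales, level).
    """
    sold = min(sales, level)
    leftover = level - sold
    return h * leftover - b * sold


def max_offset_error(played, candidates, h, b, n_periods=1000, seed=0):
    """Largest deviation of (true cost - pseudo cost) from b * demand."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(n_periods):
        demand = rng.uniform(0.0, 20.0)   # hypothetical demand model
        sales = min(demand, played)       # censored observation
        for level in candidates:
            gap = true_cost(level, demand, h, b) - pseudo_cost(level, sales, h, b)
            worst = max(worst, abs(gap - b * demand))
    return worst


# Every candidate is at or below the played level, so the offset is exact.
error = max_offset_error(played=10.0, candidates=[2.0, 5.0, 8.0, 10.0], h=1.0, b=4.0)
print(error < 1e-9)
```

Levels above the played one are excluded on purpose: for those, sales no longer reveal `min(demand, level)`, which is the asymmetry the abstract refers to.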

Paper Structure

This paper contains 55 sections, 22 theorems, 138 equations, 5 figures, 3 tables, 3 algorithms.

Key Result

Lemma 1

The function $\mu_t(\tau)$ is Lipschitz continuous in $\tau$ with constant $\max\{h, b\}$.
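As a numerical sanity check of this kind of bound, here is a minimal sketch that assumes $\mu_t$ is a single-period newsvendor-type cost with per-unit overage cost `h` and underage cost `b`. That reading of $h$ and $b$, the uniform demand sample, and the helper names are assumptions for illustration; the paper's $\mu_t$ may be defined differently, particularly with lead times. Under this assumption the slope of the sample-average cost between any two levels never exceeds `max(h, b)` in absolute value.

```python
import random


def expected_cost(level, demands, h, b):
    """Sample-average newsvendor cost of stocking `level` units."""
    total = 0.0
    for d in demands:
        total += h * max(level - d, 0.0) + b * max(d - level, 0.0)
    return total / len(demands)


def max_slope(demands, h, b, grid):
    """Largest absolute slope of the cost between neighbouring grid points."""
    costs = [expected_cost(x, demands, h, b) for x in grid]
    points = list(zip(grid, costs))
    return max(
        abs(c1 - c0) / (x1 - x0)
        for (x0, c0), (x1, c1) in zip(points, points[1:])
    )


rng = random.Random(1)
h, b = 1.0, 3.0
demands = [rng.uniform(0.0, 10.0) for _ in range(500)]  # hypothetical sample
grid = [i * 0.25 for i in range(61)]                    # levels 0.0 to 15.0

slope = max_slope(demands, h, b, grid)
print(slope <= max(h, b) + 1e-9)
```

The bound is tight at the extremes in this setup: below all demand values the cost falls at rate `b`, and above all demand values it rises at rate `h`.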

Figures (5)

  • Figure 1:
  • Figure 2: The graphs of two convex functions $\mu_t$ and $\mu_{t+1}$, before and after a change. Their minima shift from $\tau_t^*$ to $\tau_{t+1}^*$, with $\tau_t^* < \tau_v^k < \tau_{t+1}^*$. By convexity, the function values cannot coincide at both points $\tau_t^*$ and $\tau_v^k$. We can exploit this property because the costs of both policies are observed (solid lines), whereas the costs of policies $\tau > \tau_v^k$ are not (dashed lines).
  • Figure 3: Dynamic regret $R(T)$ vs. $S$ of NSIC-BL under backlogging (left), and NSIC-LS or NSIC-LSL under lost-sales (right), shown on a log-log scale and for different values of lead time $L$. Demand is drawn from the truncated normal distribution in one subfigure and from the uniform distribution in the other.
  • Figure 4: Dynamic regret $R(T)$ vs. lead time $L$ of NSIC-BL under backlogging (left), and NSIC-LS or NSIC-LSL under lost-sales (right), shown for different values of $S$. Demand is drawn from the truncated normal distribution in one subfigure and from the uniform distribution in the other.
  • Figure 5:

Theorems & Definitions (27)

  • Definition 1: Dynamic Regret
  • Definition 2: Non-Stationarity Measure
  • Lemma 1: Lipschitz Property
  • Lemma 2: Convexity
  • Lemma 3: Counterfactual Policy Evaluation
  • Lemma 4: Concentration of Expected Empirical Cost and Expected Asymptotic Cost
  • Theorem 1
  • Remark 1
  • Theorem 2
  • Theorem 3
  • ...and 17 more
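
Definition 1 is listed here by name only. A common form of dynamic regret that is consistent with the abstract ("relative to the optimal base-stock policy at each time step") is sketched below. It is an assumed form, not a quotation: $\tau_t$ is introduced here to denote the base-stock level the algorithm uses in period $t$, and the paper's definition may differ, for example in where expectations are taken.

```latex
R(T) \;=\; \sum_{t=1}^{T} \Bigl( \mu_t(\tau_t) - \min_{\tau} \mu_t(\tau) \Bigr)
     \;=\; \sum_{t=1}^{T} \bigl( \mu_t(\tau_t) - \mu_t(\tau_t^*) \bigr)
```

Here $\tau_t^*$ is the minimizer of $\mu_t$, as in the caption of Figure 2, so the comparator is allowed to change whenever the demand distribution does.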