Survey of Data-driven Newsvendor: Unified Analysis and Spectrum of Achievable Regrets

Zhuoxin Chen, Will Ma

TL;DR

This work provides a unified, quantitative analysis of data-driven Newsvendor decisions when the demand distribution is unknown. By introducing the $(\beta,\gamma,\zeta)$-clustered distribution concept, it characterizes how closely the empirical decision tracks the optimal one and derives regret bounds that span the full spectrum from $O(n^{-1/2})$ to $O(n^{-1})$ for high-probability and expectation benchmarks. It also establishes tight additive lower bounds via a Hellinger-distance-based construction, showing that no algorithm can surpass the identified rates across the spectrum. Simulations on common demand distributions validate the theory and reveal crossover behavior driven by sample size and distribution-local clustering. The results offer a complete, data-size-aware picture of learnability for Newsvendor decisions and guide practitioners on when SAA suffices or when robust alternatives are warranted.

Abstract

In the Newsvendor problem, the goal is to guess the number that will be drawn from some distribution, with asymmetric consequences for guessing too high vs. too low. In the data-driven version, the distribution is unknown, and one must work with samples from the distribution. Data-driven Newsvendor has been studied under many variants: additive vs. multiplicative regret, high probability vs. expectation bounds, and different distribution classes. This paper studies all combinations of these variants, filling in many gaps in the literature and simplifying many proofs. In particular, we provide a unified analysis based on the notion of clustered distributions, which in conjunction with our new lower bounds, shows that the entire spectrum of regrets between $1/\sqrt{n}$ and $1/n$ can be possible. Simulations on commonly-used distributions demonstrate that our notion is the "correct" predictor of empirical regret across varying data sizes.
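To make the setup above concrete, here is a minimal sketch of the Newsvendor cost and the sample average approximation (SAA) decision, which orders at an empirical quantile of the samples. The cost normalization (underage cost $q$ per unit, overage cost $1-q$ per unit) and the function names `newsvendor_cost` and `saa_order` are assumptions made for illustration; they are not taken from the paper.

```python
import math


def newsvendor_cost(order, demand, q):
    """Cost of ordering `order` when realized demand is `demand`.

    Assumed normalization: guessing too low costs q per unit of shortfall,
    guessing too high costs 1 - q per unit of excess. Under this cost, the
    expected cost is minimized at a q-quantile of the demand distribution.
    """
    underage = max(demand - order, 0.0)
    overage = max(order - demand, 0.0)
    return q * underage + (1.0 - q) * overage


def saa_order(samples, q):
    """SAA decision: the smallest sample whose empirical CDF value is >= q."""
    ordered = sorted(samples)
    # Index of the empirical q-quantile (1-based), clamped to at least 1.
    k = max(1, math.ceil(q * len(ordered)))
    return ordered[k - 1]


if __name__ == "__main__":
    demand_samples = [3, 1, 4, 1, 5]
    order = saa_order(demand_samples, q=0.5)
    avg_cost = sum(newsvendor_cost(order, d, 0.5) for d in demand_samples) / 5
    print(order, avg_cost)
```

The additive regret of a decision is then its expected cost under the true distribution minus the expected cost of the optimal decision, which the data-driven setting cannot compute directly because the distribution is unknown.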

Paper Structure

This paper contains 44 sections, 7 theorems, 117 equations, 6 figures, 3 tables.

Key Result

Theorem 2

Fix $q\in(0,1)$ and $\beta\in[0,\infty],\gamma\in(0,\infty),\zeta\in(0,(\min\{q,1-q\})^{\frac{1}{\beta+1}}/\gamma]$. If $\beta<\infty$, then whenever the number of samples satisfies $n>\frac{\log(2/\delta)}{2(\gamma\zeta)^{2\beta+2}}$, we have with probability at least $1-\delta$, for any $\delta\in(0,1)$ and any $(\beta,\gamma,\zeta)$-clustered distribution. If $\beta=\infty$, then whenever the ...
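As a worked illustration of the sample-size condition $n>\frac{\log(2/\delta)}{2(\gamma\zeta)^{2\beta+2}}$ in the finite-$\beta$ case, the sketch below computes the smallest integer $n$ that satisfies it, after checking the stated range for $\zeta$. The function name `min_samples_finite_beta` is hypothetical, and the code covers only the condition itself; it does not reproduce the regret bound, and it does not handle $\beta=\infty$ because that case is truncated in the statement above.

```python
import math


def min_samples_finite_beta(q, beta, gamma, zeta, delta):
    """Smallest integer n with n > log(2/delta) / (2 * (gamma*zeta)**(2*beta + 2)).

    Only the finite-beta sample-size condition is evaluated here.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie in (0, 1)")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if beta < 0 or math.isinf(beta):
        raise ValueError("this sketch only handles finite beta >= 0")
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    # Upper end of the allowed range for zeta, as stated in the theorem.
    zeta_max = min(q, 1.0 - q) ** (1.0 / (beta + 1.0)) / gamma
    if not 0.0 < zeta <= zeta_max:
        raise ValueError("zeta must lie in (0, zeta_max]")
    threshold = math.log(2.0 / delta) / (2.0 * (gamma * zeta) ** (2.0 * beta + 2.0))
    # The condition is a strict inequality, so take the next integer above.
    return math.floor(threshold) + 1


if __name__ == "__main__":
    # q = 0.5, gamma = 1, zeta = 0.5 match the parameters quoted for Figure 3;
    # delta = 0.05 is an arbitrary choice for illustration.
    for beta in (0, 1, 2, 3):
        print(beta, min_samples_finite_beta(0.5, beta, 1.0, 0.5, 0.05))
```

For example, with $\beta=1$, $\gamma=1$, $\zeta=0.5$, and $\delta=0.05$, the threshold is $\log(40)/0.125\approx 29.5$, so the condition first holds at $n=30$. Because $\gamma\zeta\le 1$ in the allowed range, the required $n$ grows quickly as $\beta$ increases.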

Figures (6)

  • Figure 1: The CDFs of the distributions with $q=0.4$ and $q=0.9$.
  • Figure 2: Average additive regrets (top) and the values of $\Delta(\varepsilon)$ (bottom) for the distributions under $q=0.4$ and $q=0.9$. Note that all vertical axes are plotted on a logarithmic scale. Also note that the horizontal axes for the $\Delta(\varepsilon)$ plots are $1/\varepsilon^2$, with $\varepsilon$ decreasing as one moves to the right, reflecting the scaling that $\varepsilon$ is roughly $1/\sqrt{n}$.
  • Figure 3: CDFs of the constructed distribution for four specific values of the minimum possible $\beta$ (0, 1, 2, and 3), where the parameters are set as $q = 0.5$, $a^* = 0.5$, $\gamma = 1$, and $\zeta = 0.5$.
  • Figure 4: 95th percentile of additive regrets for the distributions under $q=0.4$ and $q=0.9$.
  • Figure 5: An example where the theory based on minimum PDF around $a^*$ fails to explain the crossing points in regret curves.
  • ...and 1 more figure

Theorems & Definitions (15)

  • Definition 1
  • Theorem 2
  • Proof of Theorem 2
  • Theorem 3
  • Proof of Theorem 3
  • Theorem 4
  • Proof of Theorem 4
  • Theorem 5
  • Proof of Theorem 5
  • Theorem 6
  • ...and 5 more