From Contextual Data to Newsvendor Decisions: On the Actual Performance of Data-Driven Algorithms

Omar Besbes, Will Ma, Omar Mouchtaki

TL;DR

The paper studies how the quantity and relevance of historical contextual data influence decisions in a contextual Newsvendor setting. It introduces a local condition that links context dissimilarity to distributional proximity via the Kolmogorov distance and, focusing on Weighted ERM policies, derives exact worst-case regret characterizations through an optimization-based framework termed "learning without concentration". The key contributions include reducing the infinite-dimensional problem to a one-dimensional line search, proving that Bernoulli distributions are worst-case for separable policies, and revealing that ERM can exhibit non-monotone, oscillatory learning curves, with an identifiable effective sample size and near-optimal improvements from $k^*\text{-}\mathrm{ERM}^{\dagger}$. These findings challenge concentration-based guarantees: they show that data relevance can dominate data quantity, and they inform practical policy design under time-varying data or drift. The work thus provides sharp, context-aware performance guarantees with clear implications for data usage and algorithm design in contextual decision-making systems.

Abstract

In this work, we study how the relevance/quality and quantity of past data influence performance by analyzing a contextual Newsvendor problem, in which a decision-maker trades off between underage and overage costs under uncertain demand. We consider a setting in which past demands observed under "close-by" contexts come from close-by distributions, and we analyze the performance of data-driven algorithms through a notion of context-dependent worst-case expected regret. We analyze the broad class of Weighted Empirical Risk Minimization (WERM) policies, which weigh past data according to their similarity in the contextual space. This class includes classical policies such as ERM, k-Nearest Neighbors and kernel-based policies. Our main methodological contribution is to characterize exactly the worst-case regret of any WERM policy on any given configuration of contexts. To the best of our knowledge, this provides the first understanding of tight performance guarantees in any contextual decision-making problem, with past literature focusing on upper bounds via concentration inequalities. We instead take an optimization approach and isolate a structure in the Newsvendor loss function that allows us to reduce the infinite-dimensional optimization problem over worst-case distributions to a simple line search. This in turn allows us to unveil fundamental insights that were obfuscated by previous general-purpose bounds. We characterize actual guaranteed performance as a function of the contexts, and we provide granular insights on the learning curve of algorithms.
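To make the WERM class concrete, here is a minimal Python sketch, not the authors' implementation. It assumes normalized costs, so that the Newsvendor order is the smallest point at which the weighted empirical CDF of past demands reaches the critical quantile `q`. The function names, the tie-breaking rule, and the Gaussian kernel form are illustrative assumptions; only the three weighting families (ERM, k-Nearest Neighbors, kernel) come from the abstract.

```python
import math

def werm_order(demands, weights, q):
    """Smallest demand value whose weighted empirical CDF reaches q.

    Assumption: normalized Newsvendor costs, so the order quantity is the
    weighted q-quantile of past demands.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    cumulative = 0.0
    for demand, weight in sorted(zip(demands, weights)):
        cumulative += weight
        # Small tolerance guards against floating-point ties at exactly q.
        if cumulative >= q * total - 1e-12:
            return demand
    return max(demands)

def erm_weights(distances):
    """ERM: every past sample counts equally, whatever its context."""
    return [1.0] * len(distances)

def knn_weights(distances, k):
    """k-Nearest Neighbors: weight 1 on the k closest contexts, 0 elsewhere."""
    nearest = sorted(range(len(distances)), key=lambda i: distances[i])[:k]
    chosen = set(nearest)
    return [1.0 if i in chosen else 0.0 for i in range(len(distances))]

def gaussian_kernel_weights(distances, bandwidth):
    """Kernel weighting; the Gaussian form is an illustrative choice."""
    return [math.exp(-((d / bandwidth) ** 2)) for d in distances]

# Hypothetical data: past demands and the distance of each past context
# to the current context.
past_demands = [10.0, 20.0, 30.0, 40.0, 50.0]
context_distances = [0.05, 0.1, 0.2, 0.9, 0.5]

erm_decision = werm_order(past_demands, erm_weights(context_distances), q=0.5)
knn_decision = werm_order(past_demands, knn_weights(context_distances, k=3), q=0.5)
```

On this toy data the two policies disagree because k-Nearest Neighbors discards the two samples observed under the most dissimilar contexts, which is the relevance-versus-quantity trade-off the abstract describes.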

Paper Structure (33 sections, 18 theorems, 129 equations, 8 figures, 6 tables)

Key Result

Theorem 1

Assume the data-generation process satisfies the local condition (see Definition 1). For any sample size and any historical contexts, the problem of determining the worst-case regret of any Weighted ERM policy can be reduced from a non-convex constrained infinite-dimensional optimization problem
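The TL;DR states that Bernoulli distributions are worst-case for separable policies and that the search for the worst case becomes one-dimensional. The Python sketch below illustrates that idea in a simplified special case only: samples are assumed i.i.d. from the same Bernoulli distribution as the out-of-sample demand (zero dissimilarity), costs are normalized to `q` for underage and `1 - q` for overage, and ERM orders the smallest empirical `q`-quantile. The grid scan over the Bernoulli mean stands in for a line search. It is not the paper's procedure, and all function names are hypothetical.

```python
from math import comb

def erm_orders_zero(num_ones, n, q):
    """ERM orders 0 when the empirical fraction of zero demands is at least q."""
    return (n - num_ones) >= q * n - 1e-12

def bernoulli_erm_regret(mu, n, q):
    """Expected regret of ERM when demand is Bernoulli(mu), with n i.i.d. samples."""
    cost_order_zero = q * mu                # underage cost when demand is 1
    cost_order_one = (1 - q) * (1 - mu)     # overage cost when demand is 0
    optimal_cost = min(cost_order_zero, cost_order_one)
    expected_cost = 0.0
    for num_ones in range(n + 1):
        # Probability of seeing exactly num_ones ones among the n samples.
        prob = comb(n, num_ones) * mu ** num_ones * (1 - mu) ** (n - num_ones)
        if erm_orders_zero(num_ones, n, q):
            expected_cost += prob * cost_order_zero
        else:
            expected_cost += prob * cost_order_one
    return expected_cost - optimal_cost

def worst_case_over_bernoulli(n, q, grid_size=1001):
    """Scan the Bernoulli mean on a grid; return (worst regret, worst mean)."""
    candidates = (i / (grid_size - 1) for i in range(grid_size))
    return max((bernoulli_erm_regret(mu, n, q), mu) for mu in candidates)

# Hypothetical usage: worst-case regret for a few sample sizes at q = 0.9.
curve = {n: worst_case_over_bernoulli(n, q=0.9)[0] for n in (1, 5, 10, 20)}
```

Because the adversary here controls a single scalar, the worst case is found by scanning one interval, which is the practical payoff of a reduction to a line search.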

Figures (8)

  • Figure 1: Illustrative summary of our main insights. We note that the gray curve depicts the shape of "previous bounds" on the same scale as the bounds we derive; the actual previous bounds are in fact much higher.
  • Figure 2: Classes of policies analyzed. The figure represents the different classes of policies that we analyze and the relationships that we show in terms of containment.
  • Figure 3: Performance of ERM and alternative policies. (a) The figure depicts the worst-case regret of the ERM policy as implied by the bound of mohri2012new and by our bound. (b) The figure compares the performance of ERM and $k^*\text{-}\mathrm{ERM}^{\dagger}$ to the lower bound achievable by any data-driven policy (see \ref{rem:lb}). In these plots, $\zeta = 0.1$ and $q = 0.9$.
  • Figure 4: Comparison of ERM and $\mathrm{ERM}^{\dagger}$. The figure depicts the worst-case regret of the ERM and $\mathrm{ERM}^{\dagger}$ policies as a function of the sample size $n$, for $\zeta = 0.1$ and $q = 0.9$.
  • Figure 5: Performance of ERM for different instances $(\zeta = 0.3)$. Each figure depicts the regret of ERM for a fixed out-of-sample distribution as a function of the number of samples. The cumulative minimum curve corresponds to the lowest regret achieved by ERM by using at most $n$ samples $(q = 0.9)$.
  • ...and 3 more figures

Theorems & Definitions (41)

  • Theorem: Main result, informal version
  • Definition 1: Local condition
  • Remark 1: Partially Observed Contexts
  • Definition 2: Weighted Empirical Risk Minimization for Newsvendor
  • Lemma 1
  • Definition 3: Separable policies
  • Lemma 2
  • Proposition 1
  • Proposition 2
  • Theorem 1
  • ...and 31 more