From Contextual Data to Newsvendor Decisions: On the Actual Performance of Data-Driven Algorithms
Omar Besbes, Will Ma, Omar Mouchtaki
TL;DR
The paper addresses how the quantity and relevance of historical contextual data influence decisions in a contextual Newsvendor setting. By introducing a local condition linking context dissimilarity to distributional proximity via the Kolmogorov distance, and focusing on Weighted ERM policies, it derives exact worst-case regret characterizations through an optimization-based framework termed learning without concentration. The key contributions include reducing the infinite-dimensional problem to a one-dimensional line search, proving Bernoulli distributions are worst-case for separable policies, and revealing that ERM can exhibit non-monotone, oscillatory learning curves with an identifiable effective sample size and near-optimal k*-ERM† improvements. These findings challenge concentration-based guarantees, showing that data relevance can dominate data quantity and informing practical policy design under time-varying data or drift. The work thus provides sharp, context-aware performance guarantees with clear implications for data usage and algorithm design in contextual decision-making systems.
Abstract
In this work, we study how the relevance/quality and quantity of past data influence performance by analyzing a contextual Newsvendor problem, in which a decision-maker trades off between underage and overage costs under uncertain demand. We consider a setting in which past demands observed under "close by" contexts come from close-by distributions, and we analyze the performance of data-driven algorithms through a notion of context-dependent worst-case expected regret. We analyze the broad class of Weighted Empirical Risk Minimization (WERM) policies, which weight past data according to their similarity in the contextual space. This class includes classical policies such as ERM, k-Nearest Neighbors and kernel-based policies. Our main methodological contribution is to characterize exactly the worst-case regret of any WERM policy on any given configuration of contexts. To the best of our knowledge, this provides the first understanding of tight performance guarantees in any contextual decision-making problem, with past literature focusing on upper bounds via concentration inequalities. We instead take an optimization approach, and isolate a structure in the Newsvendor loss function that allows us to reduce the infinite-dimensional optimization problem over worst-case distributions to a simple line search. This in turn allows us to unveil fundamental insights that were obfuscated by previous general-purpose bounds. We characterize actual guaranteed performance as a function of the contexts, and provide granular insights on the learning curve of algorithms.
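To make the WERM class concrete, here is a minimal sketch of how such a policy could choose an order quantity. It relies on the standard fact that the minimizer of a weighted empirical Newsvendor cost is a weighted quantile of past demands at the critical ratio underage / (underage + overage). The function names (`werm_order`, `knn_weights`), the scalar contexts, and the absolute-difference distance are illustrative assumptions, not the paper's notation or implementation.

```python
import numpy as np

def werm_order(demands, weights, underage, overage):
    """Order quantity minimizing the weighted empirical Newsvendor cost.

    The minimizer is the smallest past demand whose normalized cumulative
    weight reaches the critical ratio q = underage / (underage + overage).
    """
    demands = np.asarray(demands, dtype=float)
    weights = np.asarray(weights, dtype=float)
    q = underage / (underage + overage)
    order_idx = np.argsort(demands, kind="stable")
    sorted_demands = demands[order_idx]
    cum_weight = np.cumsum(weights[order_idx]) / weights.sum()
    # Small tolerance guards against floating-point ties at exactly q.
    pos = np.searchsorted(cum_weight, q - 1e-12, side="left")
    return sorted_demands[min(pos, len(sorted_demands) - 1)]

def knn_weights(contexts, query, k):
    """Hypothetical k-NN weighting: weight 1 on the k closest contexts, else 0.

    Assumes scalar contexts and absolute difference as the dissimilarity.
    """
    dist = np.abs(np.asarray(contexts, dtype=float) - query)
    w = np.zeros(len(dist))
    w[np.argsort(dist, kind="stable")[:k]] = 1.0
    return w

# Illustrative data (made up for this sketch).
past_demands = [10.0, 20.0, 30.0, 40.0]
past_contexts = [0.0, 0.1, 0.5, 0.9]

# ERM: uniform weights over all past observations.
erm_decision = werm_order(past_demands, np.ones(4), underage=3.0, overage=1.0)

# k-NN with k = 2 around the query context 0.0: only the two closest samples count.
knn_decision = werm_order(
    past_demands, knn_weights(past_contexts, 0.0, k=2), underage=3.0, overage=1.0
)
```

Under these assumptions, ERM and k-NN differ only in the weight vector passed to `werm_order`; a kernel-based policy would supply weights that decay smoothly with context dissimilarity instead of the 0/1 weights used here.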
