A Conformal Approach to Feature-based Newsvendor under Model Misspecification
Junyu Cao
TL;DR
This work tackles model misspecification in feature-based newsvendor problems by introducing a model-free, distribution-free conformalized quantile framework that separates training from calibration. The Contextual Quantile Prediction with Calibration (CQPC) procedure yields a context-dependent quantile with unconditional and conditional guarantees, and it provides a confidence interval whose width shrinks as data quality and quantity improve. A central insight is the data-quality versus data-quantity trade-off, operationalized through pooling strategies, with three data-driven approaches (including GTLC and function-estimation-based pooling) to select the pooling region. Empirical results on simulations and the Capital Bikeshare dataset show substantial reductions in newsvendor loss—up to $38.6\%$ in simulations and $47.3\%$ in real data—demonstrating robust performance gains across multiple quantile-regression models. The framework offers practical guidance for robust decision-making under misspecification and opens avenues for temporal, high-dimensional, and multi-quantile extensions.
Abstract
In many data-driven decision-making problems, performance guarantees depend heavily on the correctness of model assumptions, which often fail in practice. We address this issue in the context of a feature-based newsvendor problem, where demand is influenced by observed features such as demographics and seasonality. To mitigate the impact of model misspecification, we propose a model-free and distribution-free framework inspired by conformal prediction. Our approach consists of two phases: a training phase, which can use any prediction method, and a calibration phase, which conformalizes the model bias. To enhance predictive performance, we explore the inherent trade-off between data quality and quantity: more selective training data improves quality but reduces quantity. Importantly, we provide statistical guarantees for the conformalized critical quantile that hold regardless of whether the underlying model is correct. Moreover, we quantify the confidence interval of the critical quantile, whose width decreases as data quality and quantity improve. We validate our framework using both simulated data and a real-world dataset from the Capital Bikeshare program in Washington, D.C. Across these experiments, our proposed method consistently outperforms benchmark algorithms, reducing newsvendor loss by up to 40% on the simulated data and 25% on the real-world dataset.
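To make the two-phase idea concrete, the following Python sketch trains a deliberately misspecified predictor and then shifts it using held-out calibration residuals so that the resulting order quantity covers demand at roughly the newsvendor critical ratio b/(b+h), where b and h are the unit underage and overage costs. This is a generic split-conformal quantile adjustment written for illustration, not the CQPC procedure itself, and it omits pooling-region selection entirely. The synthetic demand model, the cost values, and all function names are assumptions introduced here.

```python
import numpy as np

def fit_linear(x, y):
    """Training phase: any predictor could be used; here, a least-squares line."""
    A = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda x_new: coef[0] + coef[1] * np.asarray(x_new)

def conformal_shift(predict, x_cal, y_cal, tau):
    """Calibration phase: the ceil((n+1)*tau)-th smallest residual on held-out data."""
    scores = np.sort(y_cal - predict(x_cal))
    n = len(scores)
    k = int(np.ceil((n + 1) * tau))
    # With too few calibration points, no finite shift can give the guarantee.
    return np.inf if k > n else scores[k - 1]

def newsvendor_loss(order, demand, b, h):
    """Average underage cost b and overage cost h over the sample."""
    under = np.maximum(demand - order, 0.0)
    over = np.maximum(order - demand, 0.0)
    return float(np.mean(b * under + h * over))

rng = np.random.default_rng(0)

def sample(n):
    # Hypothetical demand: nonlinear in the feature with skewed noise,
    # so the linear model above is misspecified on purpose.
    x = rng.uniform(0.0, 1.0, n)
    d = 10.0 + 20.0 * x**2 + rng.exponential(3.0, n)
    return x, d

b, h = 3.0, 1.0            # assumed unit underage / overage costs
tau = b / (b + h)          # newsvendor critical ratio (0.75 here)

x_tr, d_tr = sample(500)   # training split
x_cal, d_cal = sample(500) # calibration split
x_te, d_te = sample(5000)  # fresh data for evaluation

predict = fit_linear(x_tr, d_tr)
shift = conformal_shift(predict, x_cal, d_cal, tau)

order = predict(x_te) + shift
coverage = float(np.mean(d_te <= order))
loss_calibrated = newsvendor_loss(order, d_te, b, h)
loss_raw = newsvendor_loss(predict(x_te), d_te, b, h)

print(f"target tau = {tau:.2f}, empirical coverage = {coverage:.3f}")
print(f"loss (raw) = {loss_raw:.3f}, loss (calibrated) = {loss_calibrated:.3f}")
```

The shift restores marginal coverage near tau even though the fitted line is wrong, but it is a single constant across all feature values. A feature-dependent correction would need calibration data drawn from near the query point, which is where the quality-versus-quantity trade-off described above comes in.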
