Calibrating an Imperfect Auxiliary Predictor for Unobserved No-Purchase Choice

Jiangkai Xiong; Kalyan Talluri; Hanzhao Wang

TL;DR

This paper addresses the problem of estimating no-purchase probabilities when only purchase data are observed. It introduces two calibration strategies for a biased outside-option predictor: an affine regression in logit space (linear calibration) and a maximum rank correlation (MRC) approach for nearly monotone bias. Each comes with finite-sample guarantees that separate predictor quality from utility-learning error. A key structural identity expresses the outside-option log-odds as the difference between the outside utility and the inclusive value of the offered products, which enables calibration without observing no-purchase events. The framework supports multiple predictors and robust aggregation. Experiments on synthetic data and real Expedia data show substantial improvements in outside-option estimation and downstream assortment revenue, particularly when predictor bias is nonlinear. The work provides practical tools for plug-in calibration when outside-option data are censored, with implications for market sizing and decision quality in retail and online platforms.
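The structural identity mentioned above can be checked numerically. The sketch below assumes a standard MNL model in which the outside option has utility `u0` and the offered products have utilities `u_offered`; the function names and numeric values are hypothetical and chosen only for illustration. Under that assumption, the log-odds of no purchase equals `u0` minus the log-sum-exp (inclusive value) of the offered utilities.

```python
import numpy as np

def no_purchase_prob(u0, u_offered):
    """MNL probability of the outside option given the offered utilities."""
    # Softmax over {outside option} plus the offered set, shifted for stability.
    all_u = np.concatenate(([u0], u_offered))
    w = np.exp(all_u - all_u.max())
    return w[0] / w.sum()

def inclusive_value(u_offered):
    """Log-sum-exp of the offered products' utilities."""
    m = np.max(u_offered)
    return m + np.log(np.exp(u_offered - m).sum())

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return np.log(p) - np.log1p(-p)

# Hypothetical utilities, for illustration only.
u0 = 0.3
u_offered = np.array([1.2, -0.4, 0.7])

p0 = no_purchase_prob(u0, u_offered)
lhs = logit(p0)                          # outside-option log-odds
rhs = u0 - inclusive_value(u_offered)    # outside utility minus inclusive value
print(abs(lhs - rhs) < 1e-9)
```

Because the inclusive value depends only on in-set utilities, which can in principle be learned from purchases, the unknown part of the log-odds is confined to the outside utility. This is what makes calibration against a predictor's log-odds plausible without no-purchase labels.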

Abstract

Firms typically cannot observe key consumer actions: whether customers buy from a competitor, choose not to buy, or even fully consider the firm's offer. This missing outside-option information makes market-size and preference estimation difficult even in simple multinomial logit (MNL) models, and it is a central obstacle in practice when only transaction data are recorded. Existing approaches often rely on auxiliary market-share, aggregated, or cross-market data. We study a complementary setting in which a black-box auxiliary predictor provides outside-option probabilities, but is potentially biased or miscalibrated because it was trained in a different channel, period, or population, or produced by an external machine-learning system. We develop calibration methods that turn such imperfect predictions into statistically valid no-purchase estimates using purchase-only data from the focal environment. First, under affine miscalibration in logit space, we show that a simple regression identifies outside-option utility parameters and yields consistent recovery of no-purchase probabilities without collecting new labels for no-purchase events. Second, under a weaker nearly monotone condition, we propose a rank-based calibration method and derive finite-sample error bounds that cleanly separate auxiliary-predictor quality from first-stage utility-learning error over observed in-set choices. Our analysis also translates estimation error into downstream decision quality for assortment optimization, quantifying how calibration accuracy affects revenue performance. The bounds provide explicit dependence on predictor alignment and utility-learning error, clarifying when each source dominates. Numerical experiments demonstrate improvements in no-purchase estimation and downstream assortment decisions, and we discuss robust aggregation extensions for combining multiple auxiliary predictors.
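To illustrate why outside-option accuracy matters for assortment decisions, the following sketch compares the assortment chosen under a correct outside utility with the one chosen under a badly mis-estimated value. It is not the paper's procedure. It assumes a plain unconstrained MNL model and searches only over revenue-ordered sets, a class known to contain an optimal assortment in that setting. The prices, utilities, and the two `u0` values are hypothetical.

```python
import numpy as np

def expected_revenue(prices, utils, u0, subset):
    """Expected per-customer revenue under MNL when `subset` is offered."""
    idx = list(subset)
    w = np.exp(utils[idx])
    return float((prices[idx] * w).sum() / (np.exp(u0) + w.sum()))

def best_revenue_ordered(prices, utils, u0):
    """Best assortment among the top-k-by-price (revenue-ordered) sets."""
    order = np.argsort(-prices)
    best_set, best_rev = [], 0.0
    for k in range(1, len(order) + 1):
        s = order[:k].tolist()
        r = expected_revenue(prices, utils, u0, s)
        if r > best_rev:
            best_set, best_rev = s, r
    return best_set, best_rev

# Hypothetical instance, for illustration only.
prices = np.array([10.0, 8.0, 5.0, 3.0])
utils = np.array([0.0, 0.5, 1.0, 1.5])
u0_true, u0_wrong = 0.0, 3.0   # true vs. mis-estimated outside utility

s_true, rev_opt = best_revenue_ordered(prices, utils, u0_true)
s_wrong, _ = best_revenue_ordered(prices, utils, u0_wrong)

# Evaluate the assortment chosen under the wrong u0 in the true environment.
rev_wrong = expected_revenue(prices, utils, u0_true, s_wrong)
gap = rev_opt - rev_wrong
print(s_true, s_wrong, gap)
```

In this instance, overestimating the outside utility makes the no-purchase option look more attractive than it is, so the planner offers a larger set that includes low-price items, and revenue evaluated under the true model falls short of the optimum.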

Paper Structure (112 sections, 29 theorems, 275 equations, 15 figures, 4 tables, 3 algorithms)

Introduction
When does our calibration approach work?
Contributions.
Related Literature
Synthetic Data for Operations Management
Unobservable Choice Prediction and Calibration
Problem Setup
Choice model.
Missing data and predictor.
Key Structural Insight for Bias Correction
Linearly Biased Predictor
Linear Regression Calibration
Oracle and practical implementations.
Learning $\{u_i(X)\}$ without outside-option labels.
Analysis of the estimation errors
...and 97 more sections

Key Result

Lemma 1

For any $X$ and any $\mathcal{S}\subseteq \mathcal{I}$,

Figures (15)

Figure 1: Exp 1, Convergence of Linear Calibration (Alg. \ref{alg:linear-calib}).
Figure 2: Exp 1, Convergence of MRC Calibration (Alg. \ref{alg:mrc-calib}).
Figure 3: Exp 2, Linear Alg vs. Utility Estimation Error $\sqrt{\bar{\tau}}$.
Figure 4: Exp 2, MRC Alg vs. Utility Estimation Error $\tau_s$.
Figure 5: Exp 3, Linear Alg vs. Predictor Noise $\sigma_{\epsilon}$.
...and 10 more figures

Theorems & Definitions (56)

Lemma 1
Theorem 1: Identification and consistency
Theorem 2: Finite-sample calibration error
Corollary 1: No-purchase probability error
Theorem 3: Population approximate optimality
Proposition 1: Local margin from design
Proposition 2: Plug-in stability
Theorem 4: Finite-sample estimation error
Corollary 2: No-purchase probability error
Proposition 3: Perturbation $\Rightarrow$ Assumption \ref{as:mnbs-formal}
...and 46 more
