A Nonparametric Approach with Marginals for Modeling Consumer Choice

Yanqiu Ruan; Xiaobo Li; Karthyek Murthy; Karthik Natarajan

TL;DR

This work studies the marginal distribution model (MDM) as a nonparametric, marginals-only approach to discrete choice, yielding tractable representation and prediction for choice data observed across offer sets. It gives a necessary-and-sufficient characterization of when observed data are MDM-representable, verifiable by solving a polynomial-size linear program, proves that the set of choice probabilities MDM can represent has positive Lebesgue measure, and shows that neither MDM nor RUM subsumes the other in general. The authors develop a robust optimization framework to predict sales and revenues for unseen assortments, a mixed-integer convex program for finding the best-fitting MDM choice probabilities when the data are inconsistent with MDM, and polynomial-time schemes for structured assortment collections. Empirical results on real (JD.com) and synthetic data show that MDM achieves competitive predictive and explanatory performance with substantially faster computation than RUM-based methods, highlighting its practical value for pricing and assortment optimization under limited data. The paper also discusses limitations related to sampling noise and utility correlations, and provides extensive proofs and supplementary materials in the electronic companion.

Abstract

Given data on the choices made by consumers for different offer sets, a key challenge is to develop parsimonious models that describe and predict consumer choice behavior while being amenable to prescriptive tasks such as pricing and assortment optimization. The marginal distribution model (MDM) is one such model, which requires only the specification of marginal distributions of the random utilities. This paper aims to establish necessary and sufficient conditions for given choice data to be consistent with the MDM hypothesis, inspired by the usefulness of similar characterizations for the random utility model (RUM). This endeavor leads to an exact characterization of the set of choice probabilities that the MDM can represent. Verifying the consistency of choice data with this characterization is equivalent to solving a polynomial-sized linear program. Since the analogous verification task for RUM is computationally intractable and neither of these models subsumes the other, MDM is helpful in striking a balance between tractability and representational power. The characterization is then used with robust optimization for making data-driven sales and revenue predictions for new unseen assortments. When the choice data lacks consistency with the MDM hypothesis, finding the best-fitting MDM choice probabilities reduces to solving a mixed-integer convex program. Numerical results using real-world data and synthetic data demonstrate that MDM exhibits competitive representational power and prediction performance compared to RUM and parametric models while being significantly faster in computation than RUM.
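To make the consistency check concrete, here is a minimal sketch under stated assumptions; it is not the paper's linear program. It assumes the characterization can be written as order constraints on one multiplier $\lambda_S$ per assortment: for a product $i$ in both $S$ and $T$, $\lambda_S > \lambda_T$ whenever $p_{i,S} < p_{i,T}$, and $\lambda_S = \lambda_T$ whenever $p_{i,S} = p_{i,T} > 0$. This form is inferred from the first-order conditions of the MDM problem with strictly increasing marginals and should be checked against Theorem 1. Instead of calling an LP solver, the sketch decides feasibility of these constraints by merging equalities and looking for a directed cycle, which answers the same yes/no question. The function name `mdm_consistent` and the input layout are hypothetical.

```python
from itertools import combinations


def mdm_consistent(choice_data, tol=1e-12):
    """Check assumed MDM order constraints on assortment multipliers.

    choice_data maps frozenset(assortment) -> {product: choice probability}.
    Returns True if multipliers satisfying all constraints exist.
    """
    sets = list(choice_data)
    parent = {S: S for S in sets}  # union-find over assortments

    def find(S):
        while parent[S] != S:
            parent[S] = parent[parent[S]]
            S = parent[S]
        return S

    strict = []  # a pair (S, T) encodes lambda_S > lambda_T
    for S, T in combinations(sets, 2):
        for i in S & T:
            p_s, p_t = choice_data[S][i], choice_data[T][i]
            if abs(p_s - p_t) <= tol:
                if p_s > tol:  # equal and positive: lambda_S = lambda_T
                    parent[find(S)] = find(T)
                # both (numerically) zero: no constraint is imposed
            elif p_s < p_t:
                strict.append((S, T))
            else:
                strict.append((T, S))

    # Directed graph on equality classes; an edge a -> b means lambda_a > lambda_b.
    graph = {find(S): set() for S in sets}
    for S, T in strict:
        a, b = find(S), find(T)
        if a == b:  # strict inequality inside one equality class
            return False
        graph[a].add(b)

    # Depth-first search: the constraints are feasible iff the graph is acyclic.
    WHITE, GREY, BLACK = 0, 1, 2
    color = {v: WHITE for v in graph}

    def has_cycle(v):
        color[v] = GREY
        for w in graph[v]:
            if color[w] == GREY or (color[w] == WHITE and has_cycle(w)):
                return True
        color[v] = BLACK
        return False

    return not any(color[v] == WHITE and has_cycle(v) for v in graph)
```

For example, if adding product 3 to the assortment {1, 2} lowers the share of product 1 but raises the share of product 2, the two shared products pull the multipliers in opposite directions and the check reports inconsistency.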

Paper Structure (77 sections, 18 theorems, 89 equations, 10 figures, 35 tables, 1 algorithm)

Introduction
The choice model and the research questions
Contributions
Related literature and a description of MDM
On the characterizations available for choice models
The marginal distribution model and related literature
Related estimation approaches
An exact characterization for MDM and its implications
A tractable characterization for MDM
On the representational power of MDM
A nonparametric approach towards prediction for new assortments
A robust optimization formulation for sales and revenue predictions
A mixed integer linear formulation for the worst case expected revenue
Polynomial-time algorithms for prediction with structured collections
Limit of MDM and the estimation of best-fitting MDM probabilities
...and 62 more sections

Key Result

Lemma 1

[natarajan2009persistency; mishra2014theoretical; chen2022distributionally] Under Assumption asp:general, the choice probabilities for a distribution that attains the maximum in (mdm) are unique and are given by the optimal solution of the following strictly concave maximization problem over the simplex: […] with the convention that $F_i^{-1}(0)= \lim_{t\downarrow 0} F_i^{-1}(t)$ and $F_i^{-1}(1)= \lim_{t\uparrow 1} F_i^{-1}(t)$ …
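As a hedged illustration of how Lemma 1 can be used numerically, suppose the concave program takes the form known from the cited persistency literature, $\max \sum_{i \in S} \nu_i x_i + \sum_{i \in S} \int_{1-x_i}^{1} F_i^{-1}(t)\,dt$ over the simplex. This form, and the symbol $\nu_i$ for deterministic utilities, are recalled from that literature rather than stated above. Its first-order condition is $\nu_i + F_i^{-1}(1 - x_i) = \lambda$ for a multiplier $\lambda$, so $x_i = 1 - F_i(\lambda - \nu_i)$, and $\lambda$ can be found by bisection on $\sum_i x_i = 1$. The sketch below assumes identical marginals for all products and uses a unit-rate exponential marginal, for which the resulting probabilities coincide with multinomial logit. The names `mdm_choice_probs`, `exp_survival`, and `softmax` are hypothetical.

```python
import math


def mdm_choice_probs(nu, survival, bracket=50.0, iters=200):
    """Solve sum_i survival(lam - nu_i) = 1 for lam by bisection.

    nu: deterministic utilities of the offered products (assumed notation).
    survival: z -> 1 - F(z), non-increasing, same marginal for every product.
    """
    lo, hi = min(nu) - bracket, max(nu) + bracket

    def total(lam):
        return sum(survival(lam - v) for v in nu)

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if total(mid) > 1.0:
            lo = mid  # probabilities sum to more than 1: raise lam
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return [survival(lam - v) for v in nu]


def exp_survival(z):
    """Survival function of a unit-rate exponential marginal."""
    return 1.0 if z <= 0 else math.exp(-z)


def softmax(nu):
    """Multinomial logit probabilities, used here only for comparison."""
    m = max(nu)
    weights = [math.exp(v - m) for v in nu]
    s = sum(weights)
    return [w / s for w in weights]


if __name__ == "__main__":
    nu = [1.0, 0.5, -0.2]
    print(mdm_choice_probs(nu, exp_survival))
    print(softmax(nu))
```

Swapping `exp_survival` for another survival function gives a different member of the MDM family without changing the solver, which is the practical appeal of specifying only marginals.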

Figures (10)

Figure 1: A summary of main contributions of the paper together with a workflow
Figure 2: An illustration of the construction of the marginal distribution $F_i$ when: (a) there is an assortment $S$ for which $p_{i,S} = 0$ (the case where $l_i < m_i$) and (b) $p_{i,S} > 0$ for all assortments with product $i$ (the case where $l_i = m_i$).
Figure 3: Illustration of prediction intervals for an unseen assortment under sampling uncertainty
Figure : Distributions (%) of Models' Ranking based on Kendall Tau Distance
Figure EC.2: The representational power of MDM
...and 5 more figures

Theorems & Definitions (40)

Lemma 1
Theorem 1: A tractable characterization for MDM
proof
Theorem 2
Lemma 2
Theorem 3: Relationship between MDM and RUM
Proposition 1: Relationship between MDM, APU and MNL
Proposition 2
Proposition 3
Corollary 1
...and 30 more
