A Nonparametric Approach with Marginals for Modeling Consumer Choice

Yanqiu Ruan, Xiaobo Li, Karthyek Murthy, Karthik Natarajan

TL;DR

This work introduces the Marginal Distribution Model (MDM) as a nonparametric, marginals-only approach to discrete choice, yielding tractable representation and prediction for choice data observed across offer sets. It gives a necessary-and-sufficient characterization of when observed data are MDM-representable (checkable with a polynomial-size linear program), proves that the set of MDM-representable choice probabilities has positive Lebesgue measure, and shows that neither MDM nor RUM subsumes the other. The authors develop robust optimization frameworks to predict sales and revenues for unseen assortments, derive mixed-integer convex programs for finding the best-fitting MDM when the data are inconsistent with it, and give polynomial-time schemes for structured assortment collections. Empirical results on real (JD.com) and synthetic data show that MDM achieves competitive predictive and explanatory performance with substantially faster computation than RUM-based methods, which makes it practical for pricing and assortment optimization under limited data. The paper also discusses limitations related to sampling noise and utility correlations, and provides proofs and supplementary materials in the electronic companion.

Abstract

Given data on the choices made by consumers for different offer sets, a key challenge is to develop parsimonious models that describe and predict consumer choice behavior while being amenable to prescriptive tasks such as pricing and assortment optimization. The marginal distribution model (MDM) is one such model, which requires only the specification of marginal distributions of the random utilities. This paper aims to establish necessary and sufficient conditions for given choice data to be consistent with the MDM hypothesis, inspired by the usefulness of similar characterizations for the random utility model (RUM). This endeavor leads to an exact characterization of the set of choice probabilities that the MDM can represent. Verifying the consistency of choice data with this characterization is equivalent to solving a polynomial-sized linear program. Since the analogous verification task for RUM is computationally intractable and neither of these models subsumes the other, MDM is helpful in striking a balance between tractability and representational power. The characterization is then used with robust optimization for making data-driven sales and revenue predictions for new unseen assortments. When the choice data lacks consistency with the MDM hypothesis, finding the best-fitting MDM choice probabilities reduces to solving a mixed integer convex program. Numerical results using real world data and synthetic data demonstrate that MDM exhibits competitive representational power and prediction performance compared to RUM and parametric models while being significantly faster in computation than RUM.
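The consistency check mentioned in the abstract can be illustrated with a small sketch. It assumes, as the abstract itself does not state the condition, that the characterization takes the form of order constraints on one scalar `lambda_S` per assortment: for a product `i` offered in both `S` and `T`, `lambda_S > lambda_T` whenever `p_{i,S} < p_{i,T}`, and `lambda_S = lambda_T` whenever `p_{i,S} = p_{i,T} != 0`. The paper verifies its condition with a linear program; the sketch below swaps in an equivalent graph check (merge equal classes, then test the strict-order graph for cycles) so that it runs with the standard library only. The function name `is_mdm_consistent` and the data layout are hypothetical.

```python
from itertools import combinations


def is_mdm_consistent(data, tol=1e-12):
    """Check the ASSUMED order condition on hypothetical scalars lambda_S.

    data maps frozenset(assortment) -> {product: choice probability}.
    Returns True if some assignment of lambda_S satisfies every constraint.
    """
    sets = list(data)
    parent = {S: S for S in sets}  # union-find over assortments

    def find(S):
        while parent[S] != S:
            parent[S] = parent[parent[S]]
            S = parent[S]
        return S

    strict = []  # pairs (S, T) meaning lambda_S > lambda_T
    for S, T in combinations(sets, 2):
        for i in S & T:
            p_s, p_t = data[S][i], data[T][i]
            if abs(p_s - p_t) <= tol:
                if p_s > tol:  # equal and nonzero: lambda_S = lambda_T
                    parent[find(S)] = find(T)
            elif p_s < p_t:
                strict.append((S, T))
            else:
                strict.append((T, S))

    # Build the strict-order graph on merged classes.
    succ = {find(S): set() for S in sets}
    indeg = {r: 0 for r in succ}
    for S, T in strict:
        u, v = find(S), find(T)
        if u == v:  # a class would have to be strictly above itself
            return False
        if v not in succ[u]:
            succ[u].add(v)
            indeg[v] += 1

    # Kahn's algorithm: the constraints are satisfiable iff the graph is acyclic.
    queue = [r for r, d in indeg.items() if d == 0]
    visited = 0
    while queue:
        u = queue.pop()
        visited += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return visited == len(succ)


consistent = {
    frozenset("ab"): {"a": 0.6, "b": 0.4},
    frozenset("abc"): {"a": 0.5, "b": 0.3, "c": 0.2},
}
inconsistent = {
    frozenset("ab"): {"a": 0.6, "b": 0.4},
    frozenset("abc"): {"a": 0.7, "b": 0.1, "c": 0.2},
}
result_consistent = is_mdm_consistent(consistent)
result_inconsistent = is_mdm_consistent(inconsistent)
```

In the second data set, adding product c raises the share of a but lowers the share of b, which would require `lambda_ab` to be both above and below `lambda_abc` under the assumed condition, so the check fails.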

Paper Structure

This paper contains 77 sections, 18 theorems, 89 equations, 10 figures, 35 tables, 1 algorithm.

Key Result

Lemma 1

(Natarajan et al. 2009; Mishra et al. 2014; Chen et al. 2022) Under Assumption asp:general, the choice probabilities for a distribution that attains the maximum in (mdm) are unique and are given by the optimal solution of a strictly concave maximization problem over the simplex (the displayed problem is not reproduced in this extract), with the convention that $F_i^{-1}(0)= \lim_{t\downarrow 0} F_i^{-1}(t)$ and $F_i^{-1}(1)= \lim_{t\uparrow 1} F_i^{-1}(t)$.
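Lemma 1 characterizes MDM choice probabilities as the solution of a concave program over the simplex. A minimal sketch of how such probabilities might be computed follows. It assumes the optimality condition commonly used in the MDM literature, $x_i = 1 - F_i(\lambda - \nu_i)$ with $\lambda$ chosen so that $\sum_i x_i = 1$, where $\nu_i$ denotes a deterministic utility; this condition, the symbol $\nu_i$, and the function names are assumptions and do not appear in this extract. The scalar $\lambda$ is found by bisection, since the sum is nonincreasing in $\lambda$. With exponential marginals the condition gives $x_i = e^{\nu_i - \lambda}$, so the result should coincide with multinomial logit probabilities, which the example uses as a sanity check.

```python
import math


def mdm_choice_probabilities(nu, survival, lo, hi, iters=200):
    """Solve sum_i survival_i(lam - nu_i) = 1 for lam by bisection.

    nu:       deterministic utilities (hypothetical notation nu_i)
    survival: functions t -> 1 - F_i(t), one per product
    lo, hi:   bracket with excess(lo) >= 0 >= excess(hi)
    """
    def excess(lam):
        return sum(sf(lam - v) for v, sf in zip(nu, survival)) - 1.0

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid  # probabilities sum to more than 1: raise lam
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return [sf(lam - v) for v, sf in zip(nu, survival)]


def exp_survival(t):
    """Survival function of a unit-rate exponential distribution."""
    return 1.0 if t <= 0 else math.exp(-t)


nu = [1.0, 0.5, 0.0]
# For exponential marginals, lam lies between max(nu) and max(nu) + log(n).
probs = mdm_choice_probabilities(
    nu, [exp_survival] * len(nu), lo=max(nu), hi=max(nu) + math.log(len(nu))
)

# Multinomial logit probabilities for comparison.
z = sum(math.exp(v) for v in nu)
mnl = [math.exp(v) / z for v in nu]
```

Other marginals can be plugged in by passing a different survival function per product, provided the bracket `[lo, hi]` is chosen so that the excess changes sign across it.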

Figures (10)

  • Figure 1: A summary of main contributions of the paper together with a workflow
  • Figure 2: An illustration of the construction of the marginal distribution $F_i$ when: (a) there is an assortment $S$ for which $p_{i,S} = 0$ (the case where $l_i < m_i$) and (b) $p_{i,S} > 0$ for all assortments with product $i$ (the case where $l_i = m_i$).
  • Figure 3: Illustration of prediction intervals for an unseen assortment under sampling uncertainty
  • Figure : Distributions (%) of Models' Ranking based on Kendall Tau Distance
  • Figure EC.2: The representational power of MDM
  • ...and 5 more figures

Theorems & Definitions (40)

  • Lemma 1
  • Theorem 1: A tractable characterization for MDM
  • proof
  • Theorem 2
  • Lemma 2
  • Theorem 3: Relationship between MDM and RUM
  • Proposition 1: Relationship between MDM, APU and MNL
  • Proposition 2
  • Proposition 3
  • Corollary 1
  • ...and 30 more