Experimental Assortments for Choice Estimation and Nest Identification

Xintong Yu; Will Ma; Michael Zhao

Abstract

What assortments (subsets of items) should be offered, to collect data for estimating a choice model over $n$ total items? We propose a structured, non-adaptive experiment design requiring only $O(\log n)$ distinct assortments, each offered repeatedly, that consistently outperforms randomized and other heuristic designs across an extensive numerical benchmark that estimates multiple different choice models under a variety of (possibly mis-specified) ground truths. We then focus on Nested Logit choice models, which cluster items into "nests" of close substitutes. Whereas existing Nested Logit estimation procedures assume the nests to be known and fixed, we present a new algorithm to identify nests based on collected data, which when used in conjunction with our experiment design, guarantees correct identification of nests under any Nested Logit ground truth. Our experiment design was deployed to collect data from over 70 million users at Dream11, an Indian fantasy sports platform that offers different types of betting contests, with rich substitution patterns between them. We identify nests based on the collected data, which lead to better out-of-sample choice prediction than ex-ante clustering from contest features. Our identified nests are ex-post justifiable to Dream11 management.
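To give some intuition for how a number of distinct assortments logarithmic in $n$ can still tell items apart, here is a minimal sketch. It is not taken from the paper: the bit-based construction and the names `binary_assortments` and `separates_all_pairs` are assumptions made purely for illustration. Each item index is written in binary, and for every bit position two assortments are offered, one containing the items with that bit set and one containing the items with that bit unset.

```python
def binary_assortments(n):
    """Hypothetical design (an assumption, not the paper's stated construction):
    for each bit position, one assortment of items whose index has that bit set
    and one assortment of items whose index has that bit unset."""
    bits = max(1, (n - 1).bit_length())  # number of bits needed to index n items
    design = []
    for b in range(bits):
        ones = frozenset(i for i in range(n) if ((i >> b) & 1) == 1)
        zeros = frozenset(i for i in range(n) if ((i >> b) & 1) == 0)
        design.extend([ones, zeros])
    return design


def separates_all_pairs(design, n):
    """True if for every ordered pair (i, j) with i != j, some assortment
    contains i but not j."""
    return all(
        any(i in s and j not in s for s in design)
        for i in range(n)
        for j in range(n)
        if i != j
    )


design = binary_assortments(8)
print(len(design))                     # number of distinct assortments
print(separates_all_pairs(design, 8))  # whether every pair is separated
```

Under this assumed construction, two distinct items differ in at least one bit, so at least one assortment contains the first item without the second. The number of assortments is twice the number of bits, which grows logarithmically in $n$.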

Paper Structure (77 sections, 13 theorems, 85 equations, 14 figures, 6 tables, 1 algorithm)

Introduction
Experiment design problem.
Nest identification problem.
Our Experiment Design
Theoretical Results for Nest Identification
Numerical Results
Comparing experiment designs.
Comparing nest identification pipelines.
Deployment at Dream11
Roadmap.
Related Work
Choice Estimation from Real Data
Choice Estimation from Randomized Assortments
Simple Experiment Designs for Estimation of Specific Choice Models
Learning while Earning in Assortment Optimization
...and 62 more sections

Key Result

proposition 1

Suppose Assumptions \ref{ass:identify} and \ref{ass:genPos} hold and take any $S\in\mathcal{S}$. For all $i\in S$: Therefore, we can make the following deductions about nest membership.

Figures (14)

Figure 1: Example deductions after Experiment 1 in \ref{table:introDesign}. "Small" boost factors were observed for $\xspace$ and $\xspace$, while "large" and distinct boost factors were observed for $\xspace$ and $\xspace$. By deduction I., we know $\xspace$ and $\xspace$ are not in the same nest as any of the 6 other items, but we do not know whether $\xspace$ and $\xspace$ are in the same nest. By deduction II., we know $\xspace$ and $\xspace$ are also not in the same nest. The table does not record the latent information that both $\xspace$ and $\xspace$ are in the same nest as at least one item outside $S$, even though this information is necessary for our nest identification. The complete identification of nests is found in \ref{sssec:algIllustration}.
Figure 2: Evolution of the adjacency matrix $E$ during nest identification. White squares indicate $E[i,j]=1$ (same nest); black squares indicate $E[i,j]=0$ (different nests); while grey squares indicate $E[i,j]=\texttt{null}$ (not yet determined). The state of the adjacency matrix $E$ is displayed after processing each of the 6 experimental assortments $S\in\mathcal{S}$, and after the "One Hop Transitivity" (line \ref{alg1:transitivity}) and "Identify Missing Pairs" (line \ref{alg1:uniqueNest}) operations.
Figure 3: Comparing experiment designs in a mis-specified setting
Figure 4: Comparing experiment designs in a well-specified setting, where we display average $\mathrm{RMSE}^\mathrm{soft}$ over the 500 Markov Chain ground truths under Markov Chain choice estimation
Figure 5: Comparing experiment designs and nest identification algorithms, averaged over the 500 Nested Logit ground truths with an outside option
...and 9 more figures
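
The three-valued adjacency matrix described in the caption of Figure 2 can be illustrated with a short sketch. The paper's actual "One Hop Transitivity" step is not reproduced on this page, so the rule below is an assumption about what such a step could look like: since "same nest" is an equivalence relation, an undetermined pair $(i,j)$ can be resolved through a single intermediate item $k$ whenever $E[i,k]$ and $E[k,j]$ are both known and at least one of them equals 1. The function name `one_hop_transitivity` and the use of `None` for `null` are hypothetical.

```python
def one_hop_transitivity(E):
    """Fill undetermined entries of a symmetric same-nest matrix using one
    intermediate item (assumed rule, not the paper's stated procedure).
    E[i][j] is 1 (same nest), 0 (different nests), or None (undetermined)."""
    n = len(E)
    out = [row[:] for row in E]  # read from E, write to out: exactly one hop
    for i in range(n):
        for j in range(n):
            if i == j or E[i][j] is not None:
                continue
            for k in range(n):
                a, b = E[i][k], E[k][j]
                if a == 1 and b == 1:
                    out[i][j] = 1  # i ~ k and k ~ j imply i ~ j
                    break
                if (a == 1 and b == 0) or (a == 0 and b == 1):
                    out[i][j] = 0  # one link is "same", the other "different"
                    break
    return out


# Four items: 0~1 and 1~2 are known to share a nest; 2 and 3 are known to differ.
E = [[1 if i == j else None for j in range(4)] for i in range(4)]
for i, j, v in [(0, 1, 1), (1, 2, 1), (2, 3, 0)]:
    E[i][j] = E[j][i] = v

F = one_hop_transitivity(E)
print(F[0][2], F[1][3], F[0][3])
```

In this example the pair (0, 2) is resolved through item 1 and the pair (1, 3) through item 2, while the pair (0, 3) stays undetermined after a single pass because resolving it requires two intermediate items.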

Theorems & Definitions (24)

definition 1: Nest Identification Problem, noiseless version
definition 2: Boost Factors
proposition 1: Nest Deductions with Outside Option
theorem 1: proven in Section \ref{sec:mainResultPf}
theorem 2: proven in \ref{sec:sampleComplexityPf}
proposition 2: Nest Deductions without Outside Option
theorem 3: proven in \ref{sec:idenWithoutOutsidePf}
lemma 1: Injectivity of the encoding
proof
theorem 4
...and 14 more
