Demand Estimation with Text and Image Data

Giovanni Compiani, Ilya Morozov, Stephan Seiler

TL;DR

This paper develops a demand estimation framework that uses unstructured text and image data to recover substitution patterns, addressing the common problem of unobserved or hard-to-quantify product attributes. The approach extracts embeddings from product images and text, reduces their dimensionality with PCA, and places random coefficients on the resulting principal components in a mixed logit model, which improves counterfactual predictions of substitution and pricing outcomes. The method is validated in a controlled experiment that measures second-choice diversions and in a large-scale analysis of Amazon product categories; in both settings it performs better than standard attribute-based methods. The work has practical implications for merger, pricing, and welfare analysis, and it provides a publicly available DeepLogit toolkit to facilitate adoption across markets and categories.

Abstract

We propose a demand estimation method that leverages unstructured text and image data to infer substitution patterns. Using pre-trained deep learning models, we extract embeddings from product images and textual descriptions and incorporate them into a random coefficients logit model. This approach enables researchers to estimate demand even when they lack data on product attributes or when consumers value hard-to-quantify attributes, such as visual design or functional benefits. Using data from a choice experiment, we show that our approach outperforms standard attribute-based models in counterfactual predictions of consumers' second choices. We also apply it across 40 product categories on Amazon and consistently find that text and image data help identify close substitutes within each category.
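
The pipeline summarized above (embeddings, then PCA, then random coefficients on the principal components) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the embeddings are synthetic random vectors rather than output of a pre-trained model, the function names `pca_components`, `mixed_logit_shares`, and `second_choice_shares` are hypothetical and are not the DeepLogit API, there is no outside option or price term, and the taste parameters are fixed by hand rather than estimated.

```python
import numpy as np

def pca_components(embeddings, n_components):
    """Project embeddings onto their top principal components via SVD."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    # Rescale each component to unit variance so sigma is easy to interpret.
    return scores / scores.std(axis=0)

def mixed_logit_shares(mean_utility, pcs, sigma, n_draws=2000, seed=0):
    """Simulated choice probabilities with normal random coefficients on the PCs."""
    rng = np.random.default_rng(seed)
    # One standard-normal taste draw per simulated consumer and component.
    draws = rng.standard_normal((n_draws, pcs.shape[1]))
    utility = mean_utility + (draws * sigma) @ pcs.T  # (n_draws, n_products)
    utility -= utility.max(axis=1, keepdims=True)     # numerical stability
    probs = np.exp(utility)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs.mean(axis=0), probs

def second_choice_shares(probs, first):
    """Second-choice probabilities among consumers whose first choice is `first`."""
    weight = probs[:, first]            # probability each consumer picks `first`
    rest = probs.copy()
    rest[:, first] = 0.0                # remove the first choice from the choice set
    rest /= rest.sum(axis=1, keepdims=True)
    return (weight[:, None] * rest).sum(axis=0) / weight.sum()

# Synthetic stand-in for text or image embeddings of ten products (assumption).
rng = np.random.default_rng(42)
embeddings = rng.standard_normal((10, 512))

pcs = pca_components(embeddings, n_components=2)
mean_utility = np.zeros(10)             # hypothetical mean utilities
sigma = np.array([1.0, 1.0])            # hypothetical taste dispersion per component

shares, probs = mixed_logit_shares(mean_utility, pcs, sigma)
second = second_choice_shares(probs, first=0)
```

With `sigma` set to zero the model collapses to a plain logit, and second choices are spread across the remaining products in proportion to their shares. With positive `sigma`, products that sit close to the first choice in principal-component space receive more of the diverted demand, which is the mechanism the paper relies on to identify close substitutes.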

Paper Structure

This paper contains 24 sections, 6 equations, 13 figures, 8 tables.

Figures (13)

  • Figure 1: Example of a choice task in our experiment. The screenshot displays the top portion of the page as it appeared to participants.
  • Figure 2: Comparison of models in terms of counterfactual RMSE on second choices. The two benchmarks in the first panel are the plain logit without random coefficients and the best-fitting mixed logit with random coefficients on observed attributes. The remaining specifications correspond to mixed logit models with random coefficients on principal components extracted from image or text embeddings.
  • Figure 3: Ten books used in our experiment.
  • Figure 4: Book locations in the space of selected principal components (Review USE model).
  • Figure 5: Results of hypothetical merger simulations. For each simulated merger of Dopamine Detox with one other book (horizontal axis), the figure shows the average price increase among the two merging books. The horizontal dashed line represents a hypothetical policy where the decision-maker challenges all mergers expected to raise prices by at least 5%. Appendix Table \ref{tab:merger-simulation-results} reports the exact price increase estimates used to construct this graph.
  • ...and 8 more figures