Demand Estimation with Text and Image Data
Giovanni Compiani, Ilya Morozov, Stephan Seiler
TL;DR
This paper develops a demand estimation framework that uses unstructured text and image data to recover substitution patterns, addressing the common problem of unobserved or hard-to-quantify product attributes. The approach extracts embeddings from product images and text with pre-trained deep learning models, reduces their dimensionality with PCA, and enters the resulting principal components into a mixed logit model with random coefficients, which improves counterfactual predictions of substitution and pricing outcomes. Validation comes from a choice experiment that measures second-choice diversions and from an analysis of 40 product categories on Amazon, both showing gains over standard attribute-based methods. The work has practical implications for merger, pricing, and welfare analysis, and it provides a publicly available DeepLogit toolkit to facilitate adoption across markets and categories.
Abstract
We propose a demand estimation method that leverages unstructured text and image data to infer substitution patterns. Using pre-trained deep learning models, we extract embeddings from product images and textual descriptions and incorporate them into a random coefficients logit model. This approach enables researchers to estimate demand even when they lack data on product attributes or when consumers value hard-to-quantify attributes, such as visual design or functional benefits. Using data from a choice experiment, we show that our approach outperforms standard attribute-based models in counterfactual predictions of consumers' second choices. We also apply it across 40 product categories on Amazon and consistently find that text and image data help identify close substitutes within each category.
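To make the pipeline concrete, the following is a minimal sketch of its three steps: reduce product embeddings with PCA, place random coefficients on the principal components, and simulate second-choice diversions. It is an illustration under stated assumptions, not the authors' DeepLogit code. Random vectors stand in for embeddings from a pre-trained model, the mean utilities and taste dispersion (`delta`, `sigma`) are fixed by hand rather than estimated, and there is no price term or outside option. All function names are hypothetical.

```python
import numpy as np


def pca_components(embeddings, n_components):
    """Project row-wise product embeddings onto their top principal components."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions of the centered embeddings.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def softmax(util):
    """Row-wise logit choice probabilities, computed stably."""
    shifted = util - util.max(axis=1, keepdims=True)
    expu = np.exp(shifted)
    return expu / expu.sum(axis=1, keepdims=True)


def simulated_utilities(delta, pcs, sigma, draws):
    """Deterministic utility of each product for each simulated consumer.

    delta: (J,) mean utilities; pcs: (J, K) principal components;
    sigma: (K,) std. dev. of the random coefficients; draws: (R, K) N(0, 1).
    """
    return delta[None, :] + (draws * sigma[None, :]) @ pcs.T


def second_choice_diversion(util):
    """D[j, k] = P(second choice is k | first choice is j), by simulation."""
    probs = softmax(util)
    n_products = util.shape[1]
    diversion = np.zeros((n_products, n_products))
    for j in range(n_products):
        others = np.delete(np.arange(n_products), j)
        # With j removed, each consumer chooses among the remaining products.
        second = softmax(util[:, others])
        # Weight consumers by how likely j is to be their first choice.
        weights = probs[:, j] / probs[:, j].sum()
        diversion[j, others] = weights @ second
    return diversion


# Synthetic stand-in for text or image embeddings (assumption: 6 products, 32 dims).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(6, 32))
pcs = pca_components(embeddings, n_components=2)

delta = np.zeros(6)                      # hand-set mean utilities
sigma = np.array([0.5, 0.5])             # hand-set taste dispersion
draws = rng.standard_normal((2000, 2))   # simulated consumers

D = second_choice_diversion(simulated_utilities(delta, pcs, sigma, draws))
# Benchmark without random coefficients: plain logit, diversion proportional to shares.
D_iia = second_choice_diversion(simulated_utilities(delta, pcs, np.zeros(2), draws))
```

With `sigma` set to zero the model collapses to plain logit, so every row of `D_iia` spreads diversion evenly across the other five products. With positive `sigma`, products that sit close together in the principal-component space attract more of each other's second choices, which is the mechanism through which text and image data can inform substitution patterns.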
