Uncovering Utility Functions from Observed Outcomes

Marta Grzeskiewicz

TL;DR

PEARL addresses the challenge of recovering unobservable utility from observed demand under price endogeneity. It fuses revealed preference theory with inverse reinforcement learning to learn a flexible, parameterized utility function that rationalizes choices within a budget constraint. The method introduces an ICNN-based utility with concavity and monotonicity guarantees and employs a two-stage learning process to enforce GARP consistency while enabling counterfactual demand predictions. In simulations with both noise-free and noisy data, PEARL demonstrates accurate parameter recovery, reliable demand forecasts, and competitive elasticity estimation, outperforming standard baselines. This approach enables policy analysis and welfare estimation by isolating price effects from demand shocks in a scalable, theory-grounded framework.

Abstract

Determining consumer preferences and utility is a foundational challenge in economics. Preferences and utility are central to consumer behaviour, which is modelled as a utility-maximising decision-making process. However, they are not observable and may not even be known to the individual making the choice; only the outcome is observed, in the form of demand. Without the ability to observe the decision-making mechanism, demand estimation becomes a challenging task, and current methods fall short because they lack scalability or cannot identify causal effects. Estimating these effects is critical when considering changes in policy, such as pricing, the impact of taxes and subsidies, and the effect of a tariff. To address the shortcomings of existing methods, we combine revealed preference theory and inverse reinforcement learning to present a novel algorithm, Preference Extraction and Reward Learning (PEARL), which, to the best of our knowledge, is the only algorithm that can uncover a representation of the utility function that best rationalises observed consumer choice data given a specified functional form. We introduce a flexible utility function, the Input-Concave Neural Network, which captures complex relationships across goods, including cross-price elasticities. Results show PEARL outperforms the benchmark on both noise-free and noisy synthetic data.
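
The abstract does not spell out the network architecture, so the following is only a minimal sketch of one way a utility function can be made concave and monotone by construction. It assumes non-negative weights (obtained by passing raw parameters through a softplus) and a concave, nondecreasing activation; the names `init_params`, `utility`, and `concave_act`, the two-hidden-layer layout, and the hidden width are all hypothetical choices for illustration, not the paper's design.

```python
import numpy as np

def softplus(z):
    # Numerically stable log(1 + exp(z)); strictly positive.
    return np.logaddexp(0.0, z)

def concave_act(z):
    # -softplus(-z) is concave and nondecreasing in z.
    return -softplus(-z)

def init_params(k, hidden=8, seed=0):
    # Hypothetical layout: two hidden layers with skip connections from x.
    rng = np.random.default_rng(seed)
    shapes = {"A0": (k, hidden), "W1": (hidden, hidden),
              "A1": (k, hidden), "w_out": (hidden,), "a_out": (k,)}
    params = {name: rng.normal(size=shape) for name, shape in shapes.items()}
    params["b0"] = np.zeros(hidden)
    params["b1"] = np.zeros(hidden)
    return params

def utility(params, x):
    # Assumption: every weight is mapped through softplus so it is positive.
    # Positive weights plus a concave, nondecreasing activation keep each
    # layer concave and nondecreasing in the consumption bundle x.
    pos = {n: softplus(params[n]) for n in ("A0", "W1", "A1", "w_out", "a_out")}
    z1 = concave_act(x @ pos["A0"] + params["b0"])
    z2 = concave_act(z1 @ pos["W1"] + x @ pos["A1"] + params["b1"])
    return z2 @ pos["w_out"] + x @ pos["a_out"]
```

Here `x` may be a single bundle of shape `(k,)` or a batch of shape `(n, k)`. Because the guarantees come from the weight constraints rather than from a penalty term, they hold for any value of the raw parameters, including during training.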

Paper Structure

This paper contains 27 sections, 34 equations, 4 figures, 4 tables, 3 algorithms.

Figures (4)

  • Figure 1: Loss (top) and gradient (bottom) computed at varying initial values of $\hat{\theta}_1$, with the true value at $\theta_1 = 0.4$ (dashed, grey). The loss is minimised and the gradient is zero at the true value. Since the gradient crosses the $x$-axis only once, the parameter is guaranteed to converge approximately to the true value with PEARL.
  • Figure 2: Comparison of contours of fitted ICNN utility function (solid, blue lines) by PEARL and the ground truth CD utility function (black, dashed lines) with $\boldsymbol{\theta} = [0.4, 0.6]$. Observations shown in grey with $N=160$, $k=2$.
  • Figure 3: Own and cross-price elasticities numerically estimated after CD (left) and ICNN (right) utility functions are trained by PEARL on $N=1600, k=5$.
  • Figure 4: Comparison of demand functions for (a) $x^1$ and (b) $x^2$ between the ground truth (black, dashed) and the demand obtained from maximising the fitted ICNN (blue). Prices for all other goods are fixed at $5$ and income at $20$. Observations are generated for prices between $1$ and $10$, as indicated by the grey bars.
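
As a point of reference for the ground-truth curves and elasticities in the captions above, the sketch below computes Cobb-Douglas (CD) demand in closed form and estimates price elasticities by finite differences in logs. It is an illustration under stated assumptions: it borrows $\boldsymbol{\theta} = [0.4, 0.6]$ from Figure 2 and the price of $5$ and income of $20$ from Figure 4, and the helper names `cd_demand` and `price_elasticities` are hypothetical. The paper's own numerical elasticity procedure is not described on this page and may differ.

```python
import numpy as np

def cd_demand(theta, prices, income):
    # Closed-form maximiser of prod_i x_i**theta_i subject to p . x = income:
    # each good receives the budget share theta_i / sum(theta).
    theta = np.asarray(theta, dtype=float)
    prices = np.asarray(prices, dtype=float)
    return theta / theta.sum() * income / prices

def price_elasticities(demand_fn, prices, income, eps=1e-5):
    # E[i, j] approximates d log x_i / d log p_j by a forward difference.
    prices = np.asarray(prices, dtype=float)
    k = prices.size
    base = np.log(demand_fn(prices, income))
    E = np.zeros((k, k))
    for j in range(k):
        bumped = prices.copy()
        bumped[j] *= np.exp(eps)  # raise log p_j by eps
        E[:, j] = (np.log(demand_fn(bumped, income)) - base) / eps
    return E

theta = [0.4, 0.6]
x = cd_demand(theta, prices=[5.0, 5.0], income=20.0)  # [1.6, 2.4]
E = price_elasticities(lambda p, m: cd_demand(theta, p, m), [5.0, 5.0], 20.0)
```

For CD utility the own-price elasticities are $-1$ and the cross-price elasticities are $0$, so `E` is approximately the negative identity matrix. The same `price_elasticities` helper accepts any demand function, for example one obtained by numerically maximising a fitted utility.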

Theorems & Definitions (2)

  • Definition 1: GARP
  • Definition 2
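
The text of Definition 1 is not reproduced on this page, so the sketch below uses the standard textbook statement of the Generalised Axiom of Revealed Preference (GARP): bundle $x_t$ is directly revealed preferred to $x_s$ if $p_t \cdot x_t \geq p_t \cdot x_s$, and GARP requires that whenever $x_t$ is revealed preferred to $x_s$ through any chain of such comparisons, $x_s$ is not strictly directly revealed preferred to $x_t$. The function name `satisfies_garp` and the tolerance `tol` are hypothetical, and this check is not claimed to be how PEARL enforces GARP consistency.

```python
import numpy as np

def satisfies_garp(prices, bundles, tol=1e-9):
    P = np.asarray(prices, dtype=float)   # shape (n, k): one price vector per observation
    X = np.asarray(bundles, dtype=float)  # shape (n, k): one chosen bundle per observation
    cost = P @ X.T                        # cost[t, s] = p_t . x_s
    own = np.diag(cost)                   # own[t] = p_t . x_t
    R = own[:, None] >= cost - tol        # direct weak revealed preference
    strict = own[:, None] > cost + tol    # direct strict revealed preference
    # Warshall's algorithm: transitive closure of the direct relation.
    for m in range(len(own)):
        R = R | (R[:, [m]] & R[[m], :])
    # Violation: x_t R x_s while x_s is strictly directly preferred to x_t.
    return not np.any(R & strict.T)

# Two Cobb-Douglas choices (theta = [0.4, 0.6], income 10): consistent.
consistent = satisfies_garp([[1, 2], [2, 1]], [[4, 3], [2, 6]])
# Each bundle is strictly cheaper at the other's prices: a violation.
violating = satisfies_garp([[1, 2], [2, 1]], [[1, 3], [3, 1]])
```

In the first example neither bundle was affordable when the other was chosen, so no revealed preference relation exists and the data pass. In the second, each bundle is strictly directly revealed preferred to the other, which forms a cycle and fails the check.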