Uncovering Utility Functions from Observed Outcomes
Marta Grzeskiewicz
TL;DR
PEARL addresses the challenge of recovering unobservable utility from observed demand under price endogeneity. It combines revealed preference theory with inverse reinforcement learning to learn a flexible, parameterized utility function that rationalizes choices made subject to a budget constraint. The method introduces an ICNN-based utility with concavity and monotonicity guarantees and uses a two-stage learning process to enforce consistency with the Generalized Axiom of Revealed Preference (GARP) while still allowing counterfactual demand predictions. In simulations with both noise-free and noisy data, PEARL demonstrates accurate parameter recovery, reliable demand forecasts, and competitive elasticity estimation, outperforming standard baselines. The approach supports policy analysis and welfare estimation by isolating price effects from demand shocks in a scalable, theory-grounded framework.
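As a rough illustration of what GARP consistency means for a finite set of price and quantity observations, the sketch below checks the axiom directly. It is a textbook-style test, not PEARL's two-stage procedure; the function name `satisfies_garp`, the tolerance `tol`, and the array layout are assumptions made for this example.

```python
import numpy as np

def satisfies_garp(prices, quantities, tol=1e-9):
    """Check GARP for T observations; both inputs have shape (T, n_goods)."""
    p = np.asarray(prices, dtype=float)
    x = np.asarray(quantities, dtype=float)
    # cost[t, s] = p_t . x_s, the cost of bundle s at the prices of observation t
    cost = p @ x.T
    own = np.diag(cost)[:, None]  # actual expenditure p_t . x_t
    # Direct revealed preference: x_t R0 x_s if x_s was affordable at t
    R = own >= cost - tol
    # Strict direct revealed preference: x_s was strictly cheaper than x_t at t
    P = own > cost + tol
    # Transitive closure of R0 (Warshall's algorithm)
    for k in range(len(p)):
        R = R | (R[:, [k]] & R[[k], :])
    # GARP is violated if x_t R x_s while x_s is strictly directly preferred to x_t
    return not np.any(R & P.T)
```

For example, `satisfies_garp([[1, 2], [2, 1]], [[2, 1], [1, 2]])` passes the check, whereas swapping the two bundles produces a pair of choices that no monotone utility can rationalize.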
Abstract
Determining consumer preferences and utility is a foundational challenge in economics. Preferences and utility are central to consumer behaviour, which is modelled as the outcome of a utility-maximising decision-making process. However, preferences and utilities are not observable and may not even be known to the individual making the choice; only the outcome is observed, in the form of demand. Without the ability to observe the decision-making mechanism, demand estimation becomes a challenging task, and current methods fall short because they either do not scale or cannot identify causal effects. Estimating these effects is critical when considering changes in policy, such as pricing, taxes and subsidies, and tariffs. To address the shortcomings of existing methods, we combine revealed preference theory and inverse reinforcement learning to present a novel algorithm, Preference Extraction and Reward Learning (PEARL), which, to the best of our knowledge, is the only algorithm that can uncover a representation of the utility function that best rationalises observed consumer choice data given a specified functional form. We introduce a flexible utility function, the Input-Concave Neural Network, which captures complex relationships across goods, including cross-price elasticities. Results show that PEARL outperforms the benchmark on both noise-free and noisy synthetic data.
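To make the idea of an input-concave utility concrete, the following is a minimal sketch of one way to build a network that is concave and non-decreasing in the consumption bundle. It is not the architecture used by PEARL, which this section does not specify. The class name `ConcaveMonotoneUtility`, the layer sizes, the activation, and the use of absolute values to keep weights non-negative are all assumptions for illustration.

```python
import numpy as np

def concave_act(z):
    # -softplus(-z) = -log(1 + exp(-z)): concave and increasing in z
    return -np.logaddexp(0.0, -z)

class ConcaveMonotoneUtility:
    """Two-layer utility u(x), concave and non-decreasing in the bundle x."""

    def __init__(self, n_goods, hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W0 = rng.normal(size=(hidden, n_goods))
        self.b0 = rng.normal(size=hidden)
        self.Wz = rng.normal(size=(hidden, hidden))
        self.Wx = rng.normal(size=(hidden, n_goods))
        self.b1 = rng.normal(size=hidden)
        self.w_out = rng.normal(size=hidden)

    def __call__(self, x):
        x = np.asarray(x, dtype=float)
        # Non-negative weights (via abs) keep every layer non-decreasing in x;
        # non-negative weights on z1 preserve concavity under composition.
        z1 = concave_act(np.abs(self.W0) @ x + self.b0)
        z2 = concave_act(np.abs(self.Wz) @ z1 + np.abs(self.Wx) @ x + self.b1)
        return float(np.abs(self.w_out) @ z2)
```

Concavity follows because a concave, non-decreasing activation applied to a non-negative combination of concave functions plus an affine term is again concave. The direct passthrough term `Wx` lets later layers depend on the bundle itself, which is what allows interactions across goods to be represented.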
