Would a Large Language Model Pay Extra for a View? Inferring Willingness to Pay from Subjective Choices

Manon Reusens, Sofie Goethals, Toon Calders, David Martens

TL;DR

This work introduces an economic framework for studying LLM-driven subjective decisions in travel contexts: implied willingness to pay (WTP) is elicited by fitting a multinomial logit model to the models' choices. It systematically evaluates how prompt design and user representations (via in-context learning and persona prompting) shape LLM valuations of hotel attributes, and compares these valuations to human benchmarks. Across baseline and more realistic settings, larger models show more coherent preferences but often overestimate WTP, especially for premium attributes like club access, while conditioning on prior preferences for cheaper options brings valuations closer to human benchmarks. The findings underscore both the potential of LLMs for decision support and the need for careful model selection, prompt design, and user modeling to avoid misalignment and strategic framing effects in practice.

Abstract

As Large Language Models (LLMs) are increasingly deployed in applications such as travel assistance and purchasing support, they are often required to make subjective choices on behalf of users in settings where no objectively correct answer exists. We study LLM decision-making in a travel-assistant context by presenting models with choice dilemmas and analyzing their responses using multinomial logit models to derive implied willingness to pay (WTP) estimates. These WTP values are subsequently compared to human benchmark values from the economics literature. In addition to a baseline setting, we examine how model behavior changes under more realistic conditions, including the provision of information about users' past choices and persona-based prompting. Our results show that while meaningful WTP values can be derived for larger LLMs, they also display systematic deviations at the attribute level. Additionally, they tend to overestimate human WTP overall, particularly when expensive options or business-oriented personas are introduced. Conditioning models on prior preferences for cheaper options yields valuations that are closer to human benchmarks. Overall, our findings highlight both the potential and the limitations of using LLMs for subjective decision support and underscore the importance of careful model selection, prompt design, and user representation when deploying such systems in practice.
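
To make the estimation step concrete, the following is a minimal sketch of how an implied WTP can be read off a multinomial (conditional) logit fit. It is not the authors' implementation: the two attributes (`price`, `view`), the coefficient values, and the simulated choices are illustrative assumptions, and the WTP ratio used is the standard one for a utility that is linear in price, a specification this summary does not spell out.

```python
import numpy as np

def fit_conditional_logit(X, chosen, n_iter=25):
    """Maximum-likelihood fit of a conditional logit via Newton's method.

    X:      (n_sets, n_alts, n_features) attribute values per alternative
    chosen: (n_sets,) index of the alternative picked in each choice set
    """
    n_sets, _, n_feat = X.shape
    beta = np.zeros(n_feat)
    for _ in range(n_iter):
        util = X @ beta                                  # (n_sets, n_alts)
        util = util - util.max(axis=1, keepdims=True)    # numerical stability
        probs = np.exp(util)
        probs /= probs.sum(axis=1, keepdims=True)
        x_chosen = X[np.arange(n_sets), chosen]          # features of chosen option
        x_mean = np.einsum("sa,saf->sf", probs, X)       # expected features per set
        grad = (x_chosen - x_mean).sum(axis=0)           # gradient of log-likelihood
        exx = np.einsum("sa,saf,sag->fg", probs, X, X)
        neg_hess = exx - x_mean.T @ x_mean               # negative Hessian
        beta = beta + np.linalg.solve(neg_hess, grad)    # Newton step
    return beta

# Hypothetical data: 5000 choice sets with 3 hotel options each (assumption).
rng = np.random.default_rng(0)
n_sets, n_alts = 5000, 3
price = rng.uniform(80, 160, size=(n_sets, n_alts))
view = rng.integers(0, 2, size=(n_sets, n_alts)).astype(float)
X = np.stack([price, view], axis=-1)

# Assumed "true" preferences: implied WTP for a view is 0.8 / 0.02 = 40.
true_beta = np.array([-0.02, 0.8])
utility = X @ true_beta + rng.gumbel(size=(n_sets, n_alts))
chosen = utility.argmax(axis=1)

beta_hat = fit_conditional_logit(X, chosen)
# WTP for an attribute = -(attribute coefficient) / (price coefficient).
wtp_view = -beta_hat[1] / beta_hat[0]
print(f"estimated WTP for a view: {wtp_view:.1f}")
```

In an LLM setting, `chosen` would hold the option the model selected for each dilemma rather than simulated choices; the ratio of coefficients is then the price increase that leaves the fitted utility unchanged when the attribute is added.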

Paper Structure

This paper contains 27 sections, 4 equations, 9 figures, 10 tables.

Figures (9)

  • Figure 1: Methodology for LLM-based discrete choice simulation and WTP estimation.
  • Figure 2: This heatmap shows the average deviation in WTP over the different attributes for the different in-context learning experiments for all models.
  • Figure 3: Effect on the WTP values of adding 3 in-context learning examples to the prompts for the different models.
  • Figure 4: Median absolute deviation from the human baseline per attribute and model for the different in-context learning experiments.
  • Figure 5: Relative increase in the WTP for each attribute when the business persona is assigned, compared to when no user information is given.
  • ...and 4 more figures