LISTEN to Your Preferences: An LLM Framework for Multi-Objective Selection

Adam S. Jovine, Tinghan Ye, Francis Bahk, Jingjing Wang, David B. Shmoys, Peter I. Frazier

TL;DR

This work introduces LISTEN, a framework that uses a Large Language Model as a zero-shot preference oracle to select a single best item from a large set with multiple objectives, guided by the user's priorities stated in natural language. It presents two algorithms: LISTEN-U, which iteratively refines a parametric linear utility over numerical attributes, and LISTEN-T, a non-parametric tournament-based method that does not assume a specific utility form. A novel concordance metric is introduced to predict when LISTEN-U is likely to excel. The methods are evaluated on flights, headphones, and exam scheduling tasks: LISTEN-U can outperform baselines in high-concordance settings, while LISTEN-T offers robust performance across domains. The results suggest that LLM-guided preference elicitation can reduce cognitive load and enable effective multi-objective decision-making. The main limitations are the subjectivity of preferences and the linear-utility assumption, which motivate future extensions toward richer utility representations and broader domain validation.
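
To make the idea of a linear utility over numerical attributes concrete, here is a minimal Python sketch. It is not the authors' implementation: the attribute names, the weights, and the `propose_weights` callback (a hypothetical stand-in for the LLM call that would revise the weights) are all assumptions made for illustration, and the summary above does not specify how LISTEN-U actually performs its refinement step.

```python
from typing import Callable, Dict, List, Tuple

Item = Dict[str, float]
Weights = Dict[str, float]


def linear_utility(item: Item, weights: Weights) -> float:
    """Weighted sum of the item's numerical attributes."""
    return sum(w * item[name] for name, w in weights.items())


def best_item(items: List[Item], weights: Weights) -> Item:
    """Return the item with the highest linear utility (first one on ties)."""
    return max(items, key=lambda it: linear_utility(it, weights))


def refine_weights(
    items: List[Item],
    propose_weights: Callable[[Weights, Item], Weights],
    initial: Weights,
    rounds: int = 3,
) -> Tuple[Weights, Item]:
    """Alternate between ranking items and asking for revised weights.

    `propose_weights` is a hypothetical placeholder for an LLM query: it sees
    the current weights and the current top item and returns new weights.
    """
    weights = dict(initial)
    for _ in range(rounds):
        top = best_item(items, weights)
        weights = propose_weights(weights, top)
    return weights, best_item(items, weights)


# Toy data: both attributes are costs, so their weights are negative.
flights = [
    {"price": 300.0, "duration": 2.0},
    {"price": 150.0, "duration": 6.0},
    {"price": 200.0, "duration": 3.0},
]
initial = {"price": -0.001, "duration": -1.0}


def toy_proposer(weights: Weights, top: Item) -> Weights:
    # Assumed behaviour for the demo only: the "oracle" keeps asking for
    # price to matter ten times more than before.
    return {**weights, "price": weights["price"] * 10}


final_weights, choice = refine_weights(flights, toy_proposer, initial, rounds=2)
```

With the initial weights the fastest flight ranks first; after two rounds of the toy proposer, price dominates and the cheapest flight is selected. A real oracle would replace `toy_proposer` with a prompt built from the user's stated priorities.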

Abstract

Human experts often struggle to select the best option from a large set of items with multiple competing objectives, a process bottlenecked by the difficulty of formalizing complex, implicit preferences. To address this, we introduce LISTEN, a framework that leverages a Large Language Model (LLM) as a zero-shot preference oracle, guided only by an expert's high-level priorities in natural language. To operate within LLM constraints like context windows and inference costs, we propose two iterative algorithms: LISTEN-U, which uses the LLM to refine a parametric utility function, and LISTEN-T, a non-parametric method that performs tournament-style selections over small batches of solutions. Evaluated on diverse tasks including flight booking, shopping, and exam scheduling, our results show LISTEN-U excels when preferences are parametrically aligned (a property we measure with a novel concordance metric), while LISTEN-T offers more robust performance. This work explores a promising direction for steering complex multi-objective decisions directly with natural language, reducing the cognitive burden of traditional preference elicitation.
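
The tournament-style selection over small batches can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: it uses a simple single-elimination scheme, and `pick_best` is a hypothetical stand-in for the LLM oracle that would be prompted with one batch and the user's natural-language priorities. The batch size and shuffling are likewise illustrative choices.

```python
import random
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def tournament_select(
    items: Sequence[T],
    pick_best: Callable[[List[T]], T],
    batch_size: int = 4,
    seed: int = 0,
) -> T:
    """Single-elimination selection over small batches.

    Each round splits the survivors into batches of at most `batch_size`,
    asks `pick_best` for one winner per batch, and repeats until a single
    item remains. Small batches keep each oracle call within a limited
    context window.
    """
    if not items:
        raise ValueError("items must be non-empty")
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2")

    rng = random.Random(seed)
    survivors = list(items)
    rng.shuffle(survivors)  # avoid a fixed pairing order

    while len(survivors) > 1:
        next_round: List[T] = []
        for start in range(0, len(survivors), batch_size):
            batch = survivors[start:start + batch_size]
            # A batch of one advances without an oracle call.
            next_round.append(pick_best(batch) if len(batch) > 1 else batch[0])
        survivors = next_round
    return survivors[0]


# Toy oracle (assumption for the demo): always prefers the lowest price.
prices = [420, 310, 275, 515, 199, 640, 305, 288, 350, 230]
winner = tournament_select(prices, pick_best=min, batch_size=3)
```

With a perfectly consistent oracle such as `min`, the winner is the global optimum regardless of how the batches are drawn. An LLM oracle offers no such guarantee, so the outcome can depend on the batching.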

Paper Structure

This paper contains 47 sections, 13 figures, 4 tables, 3 algorithms.

Figures (13)

  • Figure 1: A schematic overview of the LISTEN framework. A human decision maker provides preferences in natural language. An iterative algorithm, either LISTEN-U or LISTEN-T, then uses an LLM as a preference oracle to progressively filter a large set of candidate items (e.g., the Pareto frontier) and identify the single most preferred solution.
  • Figure 2: Performance of LISTEN algorithms and baselines on four datasets, showing the Normalized Average Rank (lower is better) over 25 iterations. The plots show results using the Llama model; similar results for the Gemini model are provided in the appendix. The results for the Flights01 dataset, which are similar to Flights00, are also in the appendix.
  • Figure 3: Performance of LISTEN-U and LISTEN-T on the Headphones-General prompt and the modified Headphones-Strict prompt, which has lower concordance.
  • Figure 4: Ablation study evaluating the impact of the preference utterance. Performance using the preference-guided prompt is compared to a base prompt containing only the persona and metric definitions. Results shown are for Llama; see the Appendix for Gemini results.
  • Figure 5: Performance of LISTEN algorithms and baselines on five datasets, showing the Average Utility Score (higher is better) over 25 iterations. The plots show results using the Llama model.
  • ...and 8 more figures