ValuePilot: A Two-Phase Framework for Value-Driven Decision-Making

Yitong Luo, Hou Hei Lam, Ziang Chen, Zhenliang Zhang, Xue Feng

TL;DR

ValuePilot tackles personalized, value-driven decision-making with a two-phase framework: a dataset generation toolkit (DGT) that uses LLMs to construct value-annotated scenarios and actions, with automated filtering and human curation, and a decision-making module (DMM) that learns to recognize values and select actions. The DMM comprises a Value Assessment Network, built on a T5 encoder, and an Action Selection Module that combines objective value signals with user-specific preferences through Contextualized Scoring and PROMETHEE-based ranking. In the reported experiments, ValuePilot aligns closely with human value preferences: it outperforms open-source LLMs in value recognition and surpasses state-of-the-art LLMs in replicating human decision sequences and first-choice actions, which the authors attribute in part to the explicit PROMETHEE multi-criteria decision-making (MCDM) step. The framework offers a scalable path toward interpretable, human-aligned AI that can generalize to novel scenarios while remaining sensitive to individual value profiles. Its limitations relate to synthetic data, cultural variability, and the scalability of value dimensions.
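To make the PROMETHEE-based ranking step concrete, here is a minimal sketch of PROMETHEE II net-flow ranking. It is not the authors' implementation: the function name `promethee_ii`, the "usual" preference function (an action is preferred on a value dimension whenever its score is strictly higher), and the use of raw preference weights in place of the paper's Contextualized Scoring output are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def promethee_ii(
    scores: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> Tuple[List[int], List[float]]:
    """Rank actions by PROMETHEE II net outranking flow.

    scores[i][k]: score of action i on value dimension k (higher is better).
    weights[k]:   non-negative importance of dimension k (assumed here to
                  stand in for a user's value preferences).
    Assumption: the "usual" preference function, P(d) = 1 if d > 0 else 0.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two actions to compare")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    w = [x / total for x in weights]  # normalize weights to sum to 1

    def preference(a: int, b: int) -> float:
        # Aggregated preference of action a over action b.
        return sum(wk for wk, sa, sb in zip(w, scores[a], scores[b]) if sa > sb)

    net_flows = []
    for a in range(n):
        others = [b for b in range(n) if b != a]
        positive = sum(preference(a, b) for b in others) / (n - 1)
        negative = sum(preference(b, a) for b in others) / (n - 1)
        net_flows.append(positive - negative)

    # Highest net flow first.
    ranking = sorted(range(n), key=lambda i: net_flows[i], reverse=True)
    return ranking, net_flows


# Three hypothetical actions scored on three value dimensions.
example_scores = [[3, 1, 2], [1, 3, 2], [2, 2, 1]]
example_weights = [5, 3, 2]
ranking, net_flows = promethee_ii(example_scores, example_weights)
print(ranking)  # [0, 1, 2]
```

Because every pairwise comparison is an explicit weighted sum, each position in the ranking can be traced back to the dimensions that produced it, which is the sense in which such a step is a white box.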

Abstract

Despite recent advances in artificial intelligence (AI), ensuring personalized decision-making in tasks that are not covered by training datasets remains challenging. To address this issue, we propose ValuePilot, a two-phase value-driven decision-making framework comprising a dataset generation toolkit (DGT) and a decision-making module (DMM) trained on the generated data. DGT generates scenarios that are based on value dimensions and closely mirror real-world tasks, with automated filtering techniques and human curation to ensure the validity of the dataset. Trained on the generated dataset, DMM learns to recognize the inherent values of scenarios, compute action feasibility, and navigate the trade-offs between multiple value dimensions to make personalized decisions. Extensive experiments demonstrate that, given human value preferences, our DMM most closely aligns with human decisions, outperforming Claude-3.5-Sonnet, Gemini-2-flash, Llama-3.1-405b and GPT-4o. This research is a preliminary exploration of value-driven decision-making. We hope it will stimulate interest in value-driven and personalized decision-making within the community.

Paper Structure

This paper contains 45 sections, 16 equations, 6 figures, 6 tables, 3 algorithms.

Figures (6)

  • Figure 1: ValuePilot framework. The ValuePilot framework simulates individual preferences and guides AI decision-making through a two-phase process. a) DGT. A dataset generation toolkit that uses a large language model, following detailed instructions, to generate a structured dataset of scenario descriptions, potential actions, and corresponding value dimension scores. b) DMM. It consists of two modules: the Value Assessment Network and the Action Selection Module. The Value Assessment Network processes scenario and action representations through an encoder, followed by a multi-head self-attention mechanism and average pooling, and finally maps the embeddings to objective value scores for scenarios and actions via an MLP. The Action Selection Module takes the scores output by the Value Assessment Network together with pre-specified personal value preferences. Through a purely mathematical white-box process involving Contextualized Scoring and the PROMETHEE method, it produces the final ranked actions.
  • Figure 2: Comparison between our DMM and LLMs in terms of their decision rankings' similarity to human decisions. DMM achieves the highest scores, outperforming the best baseline (GPT-4o-mini) by +6.53% in OS-Sim and +11.37% in First-Acc.
  • Figure 3: Statistical characteristics of the dataset; (a) and (b) show how often the scores of each of the six value dimensions appear as positive and negative values in the scenarios; (c) shows a word cloud of the words that appear in the scenarios; (d) and (e) show the number of agents in the scenarios of the dataset; (f) shows the score distribution of all value dimensions.
  • Figure 4: Pairwise Correlations and Distributions of Value Preferences Across the Six Dimensions with Pearson correlation coefficient (r)
  • Figure 5: Radar chart showing the value preferences of all subjects. Each polygon represents a subject, and different colors of outline correspond to different subjects.
  • ...and 1 more figure
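The data flow that the Figure 1 caption describes for the Value Assessment Network (encoder output, then multi-head self-attention, average pooling, and an MLP) can be sketched in NumPy as follows. This is only an illustration of the shapes involved, under several assumptions: the T5 encoder is replaced by placeholder token embeddings, all weights are random rather than trained, and the head count, hidden size, single-hidden-layer ReLU MLP, and every function name are hypothetical. The six outputs correspond to the six value dimensions mentioned in the Figure 3 caption.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(x, wq, wk, wv, wo, num_heads):
    """x: (tokens, d_model) -> (tokens, d_model)."""
    t, d = x.shape
    dh = d // num_heads
    # Project, then split the model dimension into heads: (heads, tokens, dh).
    q = (x @ wq).reshape(t, num_heads, dh).transpose(1, 0, 2)
    k = (x @ wk).reshape(t, num_heads, dh).transpose(1, 0, 2)
    v = (x @ wv).reshape(t, num_heads, dh).transpose(1, 0, 2)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))  # (heads, t, t)
    heads = attn @ v                                         # (heads, t, dh)
    merged = heads.transpose(1, 0, 2).reshape(t, d)          # concatenate heads
    return merged @ wo


def init_params(d_model, hidden, num_values, seed=0):
    # Random placeholder weights; a real model would learn these.
    rng = np.random.default_rng(seed)
    s = 1.0 / np.sqrt(d_model)
    return {
        "wq": rng.normal(0, s, (d_model, d_model)),
        "wk": rng.normal(0, s, (d_model, d_model)),
        "wv": rng.normal(0, s, (d_model, d_model)),
        "wo": rng.normal(0, s, (d_model, d_model)),
        "w1": rng.normal(0, s, (d_model, hidden)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(0, 1.0 / np.sqrt(hidden), (hidden, num_values)),
        "b2": np.zeros(num_values),
    }


def value_head(token_embeddings, params, num_heads=4):
    """Map encoder token embeddings to one score per value dimension."""
    attended = multi_head_self_attention(
        token_embeddings, params["wq"], params["wk"], params["wv"],
        params["wo"], num_heads,
    )
    pooled = attended.mean(axis=0)                                   # average pooling
    hidden = np.maximum(0.0, pooled @ params["w1"] + params["b1"])  # ReLU layer
    return hidden @ params["w2"] + params["b2"]


# Placeholder for encoder output: 12 tokens with 16-dimensional embeddings.
embeddings = np.random.default_rng(1).normal(size=(12, 16))
value_scores = value_head(embeddings, init_params(16, 32, 6))
print(value_scores.shape)  # (6,)
```

Average pooling reduces a variable-length token sequence to a single fixed-size vector, so the same MLP can score scenarios and actions of any length.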