ValueCompass: A Framework for Measuring Contextual Value Alignment Between Human and LLMs

Hua Shen, Tiffany Knearem, Reshmi Ghosh, Yu-Ju Yang, Nicholas Clark, Tanushree Mitra, Yun Huang

TL;DR

ValueCompass addresses how to quantify and improve alignment between humans and LLMs across real-world contexts. It integrates Schwartz's Theory of Basic Values into a three-part framework: a contextual value alignment instrument (Value Form), robust prompting strategies, and quantitative alignment metrics (Alignment Rate, Alignment Distance, Alignment Ranking). The framework is applied to four scenarios and five LLMs with 112 human participants across seven countries, revealing widespread misalignments (e.g., humans favor National Security values that LLMs reject) and clear context effects, with the best F1 reaching $0.529$. The authors argue for context-aware, human-in-the-loop alignment strategies and demonstrate ValueCompass as a practical diagnostic tool to guide responsible AI design and governance.

Abstract

As AI systems become more advanced, ensuring their alignment with a diverse range of individuals and societal values becomes increasingly critical. But how can we capture fundamental human values and assess the degree to which AI systems align with them? We introduce ValueCompass, a framework of fundamental values, grounded in psychological theory and a systematic review, to identify and evaluate human-AI alignment. We apply ValueCompass to measure the value alignment of humans and large language models (LLMs) across four real-world scenarios: collaborative writing, education, the public sector, and healthcare. Our findings reveal concerning misalignments between humans and LLMs; for example, humans frequently endorsed values like "National Security" that LLMs largely rejected. We also observe that values differ across scenarios, highlighting the need for context-aware AI alignment strategies. This work provides insights into the design space of human-AI alignment, laying the foundations for developing AI systems that responsibly reflect societal values and ethics.

Paper Structure

This paper contains 17 sections, 3 equations, 12 figures, 3 tables.

Figures (12)

  • Figure 1: (A) An overview of the ValueCompass framework for systematically measuring value alignment between LLMs and humans across contextual scenarios. (B) Evaluation with four representative scenarios in this study, with the framework extendable to additional values and scenarios.
  • Figure 2: Value responses from humans (A) and Deepseek-r1 generations (B), and the Alignment Distance between them (C).
  • Figure 3: Value Form is a context-aware instrument to measure the value alignment between humans and LLMs. It includes a task introduction, a vignette, and 56 value statements grounded in Schwartz's Theory of Basic Values. As shown in Figure 1, humans and LLMs rate each value on a scale from "-2: Strongly Disagree" to "2: Strongly Agree", plus "Irrelevant." The form aims to assess human-AI value alignment contextualized in various scenarios.
  • Figure 4: Four vignettes, designed to contextualize the value statements in the ValueCompass framework, are organized by increasing risk and reflect real-world tasks: collaborative writing, education, the public sector, and healthcare. Images are included in the vignettes to aid respondents in understanding the context.
  • Figure 5: Heatmaps of values for the Deepseek-r1 model: (A) human responses, (B) LLM generations, and (C) Alignment Distance, across four social topics.
  • ...and 7 more figures