Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties

Taylor Sorensen, Liwei Jiang, Jena Hwang, Sydney Levine, Valentina Pyatkin, Peter West, Nouha Dziri, Ximing Lu, Kavel Rao, Chandra Bhagavatula, Maarten Sap, John Tasioulas, Yejin Choi

TL;DR

This work tackles value pluralism in AI by introducing ValuePrism, a large-scale dataset of 218k contextualized values, rights, and duties drawn from 31k human-written situations, and Kaleido, a multi-task language model that generates, explains, and assesses the relevance and valence of those plural considerations. Built on a four-task setup (Generation, Relevance, Valence, Explanation) and distilled from GPT-4, ValuePrism is validated by diverse human annotators and shown to cover a wide range of topics, including many rights from the UDHR. The KaleidoSYS system builds on the Kaleido model to produce diverse, high-quality value sets and performs strongly relative to the GPT-4 teacher on coverage, accuracy, and interpretability, with notable zero-shot generalization to the ETHICS and CommonsenseNormBank benchmarks. The work also provides an interpretable decision mechanism (KaleidoDec) and shows that its output entropy can indicate decision variability, offering a principled way to surface and manage pluralistic reasoning in AI. Overall, ValuePrism and Kaleido constitute a concrete, openly accessible step toward modeling and leveraging plural human values in AI decision-making, with explicit consideration of diversity and transparency.

Abstract

Human values are crucial to human decision-making. Value pluralism is the view that multiple correct values may be held in tension with one another (e.g., when considering lying to a friend to protect their feelings, how does one balance honesty with friendship?). As statistical learners, AI systems fit to averages by default, washing out these potentially irreducible value conflicts. To improve AI systems to better reflect value pluralism, the first-order challenge is to explore the extent to which AI systems can model pluralistic human values, rights, and duties as well as their interaction. We introduce ValuePrism, a large-scale dataset of 218k values, rights, and duties connected to 31k human-written situations. ValuePrism's contextualized values are generated by GPT-4 and deemed high-quality by human annotators 91% of the time. We conduct a large-scale study with annotators across diverse social and demographic backgrounds to try to understand whose values are represented. With ValuePrism, we build Kaleido, an open, light-weight, and structured language-based multi-task model that generates, explains, and assesses the relevance and valence (i.e., support or oppose) of human values, rights, and duties within a specific context. Humans prefer the sets of values output by our system over the teacher GPT-4, finding them more accurate and with broader coverage. In addition, we demonstrate that Kaleido can help explain variability in human decision-making by outputting contrasting values. Finally, we show that Kaleido's representations transfer to other philosophical frameworks and datasets, confirming the benefit of an explicit, modular, and interpretable approach to value pluralism. We hope that our work will serve as a step to making more explicit the implicit values behind human decision-making and to steering AI systems to make decisions that are more in accordance with them.

Paper Structure (113 sections, 1 equation, 15 figures, 22 tables, 1 algorithm)

Figures (15)

  • Figure 1: Different human values relate to, support, or oppose everyday situations to varying degrees. Kaleido is designed to generate, explain, and assess how pluralistic human values, rights, and duties may shape human judgments.
  • Figure 2: KaleidoSYS system workflow, which includes 1) generating 100 values, rights, and duties; 2) filtering by relevance as rated by Kaleido; 3) removing repetitive items; and 4) computing relevance and valence scores for each value, right, and duty.
  • Figure 3: The output entropy of KaleidoDec is predictive of ambiguity in MoralChoice and controversialness in SocialChem. A threshold is chosen to maximize F1-score.
  • Figure 4: By sweeping KaleidoSYS's parameters, we are able to trade precision for recall (w.r.t. the GPT-4 generated test split of ValuePrism) and output many more (or fewer) values, rights, and duties.
  • Figure 5: Kaleido is sensitive to subtle changes in inputs, changing relevance and valence scores accordingly.
  • ...and 10 more figures
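
The Figure 3 caption describes using output entropy as a signal of ambiguity, with a threshold chosen to maximize F1-score. The following is a minimal sketch of that idea only, not the authors' implementation. It assumes each situation comes with a probability distribution over decision outcomes and a binary "ambiguous" label; the function names (`entropy`, `f1_score`, `best_threshold`) and all the numbers are hypothetical and chosen purely for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def f1_score(predicted, actual):
    """F1-score for two equal-length lists of booleans."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(entropies, labels):
    """Pick the entropy cutoff that maximizes F1.

    A situation is predicted ambiguous when its entropy is >= the cutoff.
    Candidate cutoffs are the observed entropy values themselves.
    """
    best_t, best_f1 = None, -1.0
    for t in sorted(set(entropies)):
        preds = [h >= t for h in entropies]
        score = f1_score(preds, labels)
        if score > best_f1:
            best_t, best_f1 = t, score
    return best_t, best_f1


# Hypothetical decision distributions (assumed, not taken from the paper).
decision_probs = [
    [0.98, 0.01, 0.01],  # confident decision -> low entropy
    [0.40, 0.35, 0.25],  # split decision -> high entropy
    [0.90, 0.05, 0.05],
    [0.34, 0.33, 0.33],
]
# Hypothetical ground-truth ambiguity labels for the same four situations.
is_ambiguous = [False, True, False, True]

entropies = [entropy(p) for p in decision_probs]
threshold, f1 = best_threshold(entropies, is_ambiguous)
print(f"threshold={threshold:.3f}, F1={f1:.2f}")
```

On this toy data the two near-uniform distributions have the highest entropy, so a cutoff between the confident and split cases separates the labels perfectly. Real data would rarely separate this cleanly, and a threshold tuned this way should be selected on held-out data rather than on the evaluation set.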