Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties
Taylor Sorensen, Liwei Jiang, Jena Hwang, Sydney Levine, Valentina Pyatkin, Peter West, Nouha Dziri, Ximing Lu, Kavel Rao, Chandra Bhagavatula, Maarten Sap, John Tasioulas, Yejin Choi
TL;DR
This work addresses value pluralism in AI with two contributions: ValuePrism, a large-scale dataset of 218k contextualized values, rights, and duties drawn from 31k human-written situations, and Kaleido, a multi-task language model that generates, explains, and assesses the relevance and valence of those plural considerations. ValuePrism is generated by GPT-4, validated by human annotators from diverse backgrounds, and shown to cover a wide range of topics, including many rights from the Universal Declaration of Human Rights (UDHR). Kaleido is distilled from this data using a four-task setup (Generation, Relevance, Valence, Explanation). The KaleidoSYS system builds on the model to produce diverse, high-quality value sets that human evaluators prefer over those of the GPT-4 teacher for accuracy and coverage, and it shows notable zero-shot generalization to the ETHICS and Commonsense Norm Bank benchmarks. The work also provides an interpretable decision mechanism (KaleidoDec) and shows that output entropy can indicate variability in human decision-making, offering a principled way to surface and manage pluralistic reasoning in AI. Overall, ValuePrism and Kaleido constitute a concrete, openly accessible step toward modeling and leveraging plural human values in AI decision-making, with explicit consideration of diversity and transparency.
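The link between output entropy and decision variability can be illustrated with a minimal sketch. It is not the authors' implementation: the three valence labels, the probability values, and the function name `shannon_entropy` are all assumptions made for illustration. The sketch only shows the general idea that a flatter distribution over outcomes has higher Shannon entropy, which is the kind of signal that could flag a contested situation.

```python
import math


def shannon_entropy(probs):
    """Return the Shannon entropy (in bits) of a discrete distribution.

    The input is normalized first, so unnormalized scores are accepted.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive number")
    normalized = [p / total for p in probs]
    # Terms with p == 0 contribute nothing, so they are skipped.
    return -sum(p * math.log2(p) for p in normalized if p > 0)


# Hypothetical valence distributions for two situations (illustrative numbers,
# hypothetical label names; not taken from the paper).
clear_cut = {"support": 0.90, "oppose": 0.05, "either": 0.05}
contested = {"support": 0.40, "oppose": 0.40, "either": 0.20}

clear_cut_entropy = shannon_entropy(clear_cut.values())
contested_entropy = shannon_entropy(contested.values())

# The contested situation has the flatter distribution and the higher entropy.
print(f"clear-cut: {clear_cut_entropy:.3f} bits")
print(f"contested: {contested_entropy:.3f} bits")
```

Under these assumptions, a system could surface the contrasting values behind high-entropy cases instead of collapsing them into a single verdict.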
Abstract
Human values are crucial to human decision-making. Value pluralism is the view that multiple correct values may be held in tension with one another (e.g., when considering lying to a friend to protect their feelings, how does one balance honesty with friendship?). As statistical learners, AI systems fit to averages by default, washing out these potentially irreducible value conflicts. To improve AI systems to better reflect value pluralism, the first-order challenge is to explore the extent to which AI systems can model pluralistic human values, rights, and duties as well as their interaction. We introduce ValuePrism, a large-scale dataset of 218k values, rights, and duties connected to 31k human-written situations. ValuePrism's contextualized values are generated by GPT-4 and deemed high-quality by human annotators 91% of the time. We conduct a large-scale study with annotators across diverse social and demographic backgrounds to understand whose values are represented. With ValuePrism, we build Kaleido, an open, lightweight, and structured language-based multi-task model that generates, explains, and assesses the relevance and valence (i.e., support or oppose) of human values, rights, and duties within a specific context. Humans prefer the sets of values output by our system over those of the teacher GPT-4, finding them more accurate and broader in coverage. In addition, we demonstrate that Kaleido can help explain variability in human decision-making by outputting contrasting values. Finally, we show that Kaleido's representations transfer to other philosophical frameworks and datasets, confirming the benefit of an explicit, modular, and interpretable approach to value pluralism. We hope that our work will serve as a step toward making the implicit values behind human decision-making more explicit and toward steering AI systems to make decisions that are more in accordance with them.
