Cultivating Pluralism In Algorithmic Monoculture: The Community Alignment Dataset
Lily Hong Zhang, Smitha Milli, Karen Jusko, Jonathan Smith, Brandon Amos, Wassim Bouaziz, Manon Revel, Jack Kussman, Yasha Sheynin, Lisa Titus, Bhaktipriya Radharapu, Jane Yu, Vidya Sarma, Kris Rose, Maximilian Nickel
TL;DR
This work addresses the challenge of aligning large language models with diverse global user preferences. It shows that humans vary substantially along core value axes (the Inglehart-Welzel, or IW, dimensions) and that current model outputs do not capture this variation. The authors identify an algorithmic monoculture in the responses of 21 LLMs and demonstrate that existing preference-collection methods produce largely homogeneous candidate sets, which hinders learning of heterogeneous preferences. They propose negatively-correlated (NC) sampling, a simple prompting strategy that generates more diverse candidate responses and significantly improves how well standard alignment methods learn preferences across IW values. Building on this, they collect Community Alignment (CA), the largest open-source, multilingual, multi-turn preference dataset to date (almost 200,000 comparisons from 3,196 annotators across five countries), featuring NC sampling, non-English data, free-form explanations, and prompt-level annotator overlap. CA is designed to enable new analyses and methods for pluralistic alignment, so that LLMs can better serve a globally diverse population, and it highlights the need for diverse data collection practices in AI systems.
Abstract
How can large language models (LLMs) serve users with varying preferences that may conflict across cultural, political, or other dimensions? To advance this challenge, this paper establishes four key results. First, we demonstrate, through a large-scale multilingual human study with representative samples from five countries (N=15,000), that humans exhibit significantly more variation in preferences than the responses of 21 state-of-the-art LLMs. Second, we show that existing methods for preference dataset collection are insufficient for learning the diversity of human preferences even along two of the most salient dimensions of variability in global values, due to the underlying homogeneity of candidate responses. Third, we argue that this motivates the need for negatively-correlated sampling when generating candidate sets, and we show that simple prompt-based techniques for doing so significantly enhance the performance of alignment methods in learning heterogeneous preferences. Fourth, based on this novel candidate sampling approach, we collect and open-source Community Alignment, the largest and most representative multilingual and multi-turn preference dataset to date, featuring almost 200,000 comparisons from annotators spanning five countries. We hope that the Community Alignment dataset will be a valuable resource for improving the effectiveness of LLMs for a diverse global population.
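To make the contrast with independent sampling concrete, the sketch below shows one way a prompt-based negatively-correlated sampler could look: a single model call is asked for all candidates at once and told to make them reflect different values, instead of drawing each candidate separately. The prompt wording, the function names (`build_nc_prompt`, `parse_candidates`, `nc_sample`), and the `generate` callable are illustrative assumptions, not the prompts or code used by the authors.

```python
import re
from typing import Callable, List


def build_nc_prompt(user_prompt: str, k: int = 4) -> str:
    """Build one prompt that requests k mutually distinct responses.

    Hypothetical wording: the paper's actual prompt is not reproduced here.
    """
    return (
        f"Write {k} different responses to the user message below. "
        "Each response should reflect a distinct set of values or perspective.\n"
        f"Label them 'Response 1:' through 'Response {k}:'.\n\n"
        f"User message: {user_prompt}"
    )


def parse_candidates(completion: str) -> List[str]:
    """Split a labeled completion into individual candidate responses."""
    parts = re.split(r"Response \d+:", completion)
    # parts[0] is any text before the first label, so it is dropped.
    return [p.strip() for p in parts[1:] if p.strip()]


def nc_sample(generate: Callable[[str], str], user_prompt: str, k: int = 4) -> List[str]:
    """Return a candidate set from a single joint generation.

    `generate` is an assumed text-in, text-out wrapper around any LLM.
    """
    return parse_candidates(generate(build_nc_prompt(user_prompt, k)))
```

Because the candidates are produced jointly, the model can condition each response on the others and steer away from them, which is what makes the samples negatively correlated rather than independent. A real pipeline would also need to handle completions that ignore the labeling format.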
