CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models

Giada Pistilli, Alina Leidinger, Yacine Jernite, Atoosa Kasirzadeh, Alexandra Sasha Luccioni, Margaret Mitchell

TL;DR

CIVICS introduces a hand-crafted, multilingual dataset for evaluating culturally informed values in large language models across five languages and nine national contexts. The dataset is curated from official sources and annotated with fine-grained, rights-based labels, enabling two evaluation modalities: next-token log-probabilities and long-form responses. Through six-stage annotation and rigorous cross-language experiments on open-weight LLMs, the paper reveals substantial cross-linguistic and topic-based variation in model behavior, including interpretable refusals and topic-specific disagreements, with notable patterns for LGBTQI rights and immigration. The work emphasizes reproducibility, transparency, and ethical considerations, offering a framework to study value pluralism in AI and encouraging broader, globally inclusive evaluations beyond English-centric benchmarks. CIVICS and its demo resources are released openly to support future research in culturally aware AI alignment and evaluation across diverse linguistic communities.

Abstract

This paper introduces the "CIVICS: Culturally-Informed & Values-Inclusive Corpus for Societal impacts" dataset, designed to evaluate the social and cultural variation of Large Language Models (LLMs) across multiple languages and value-sensitive topics. We create a hand-crafted, multilingual dataset of value-laden prompts which address specific socially sensitive topics, including LGBTQI rights, social welfare, immigration, disability rights, and surrogacy. CIVICS is designed to generate responses showing LLMs' encoded and implicit values. Through our dynamic annotation processes, tailored prompt design, and experiments, we investigate how open-weight LLMs respond to value-sensitive issues, exploring their behavior across diverse linguistic and cultural contexts. Using two experimental set-ups based on log-probabilities and long-form responses, we show social and cultural variability across different LLMs. Specifically, experiments involving long-form responses demonstrate that refusals are triggered disparately across models, but consistently and more frequently in English or translated statements. Moreover, specific topics and sources lead to more pronounced differences across model answers, particularly on immigration, LGBTQI rights, and social welfare. As shown by our experiments, the CIVICS dataset aims to serve as a tool for future research, promoting reproducibility and transparency across broader linguistic settings, and furthering the development of AI technologies that respect and reflect global cultural diversities and value pluralism. The CIVICS dataset and tools will be made available upon publication under open licenses; an anonymized version is currently available at https://huggingface.co/CIVICS-dataset.

Paper Structure

This paper contains 79 sections, 5 figures, and 9 tables.

Figures (5)

  • Figure 1: Baseline experiment 1 -- Larger base models yielded more variation and increased "disagree" labels.
  • Figure 2: Distribution of model refusals on the topics Immigration and LGBTQI rights, by model, fine-grained labels (top), and statement region and language (bottom).
  • Figure 3: Comparing ratings for the two proposed methods (log-probabilities and long-form responses), with ratings given by a majority vote between different framings of the statement, shows similarities in the topics and languages triggering the most disagreements.
  • Figure 4: Refusal rates for all topics except surrogacy.
  • Figure 5: Value labels and organizations with the most variation in answers across models.
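
The Figure 3 caption mentions ratings obtained by a majority vote between different framings of a statement. The following is a minimal sketch of such a vote, not the authors' implementation: the function name `majority_rating`, the tie-handling rule, and the `"neutral"` fallback label are all assumptions made for illustration; only the "agree"/"disagree" style of label is suggested by the captions above.

```python
from collections import Counter


def majority_rating(ratings, tie_label="neutral"):
    """Aggregate per-framing ratings for one statement by majority vote.

    `ratings` holds one label per framing of the statement.
    `tie_label` is a hypothetical fallback returned when the two most
    common labels are tied; the source does not say how ties are resolved.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    counts = Counter(ratings).most_common()
    # A tie exists when the top two labels have the same count.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_label
    return counts[0][0]


# Three hypothetical framings of the same statement, rated by one model.
example = ["agree", "disagree", "agree"]
result = majority_rating(example)
```

With an odd number of framings and two possible labels, no tie can occur, so the fallback label only matters when more labels or an even number of framings are used.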