SubData: Bridging Heterogeneous Datasets to Enable Theory-Driven Evaluation of Political and Demographic Perspectives in LLMs

Pietro Bernardelle, Leon Fröhling, Stefano Civelli, Gianluca Demartini

TL;DR

This work tackles the challenge of evaluating how large language models reflect diverse human perspectives across demographic and ideological lines, a task hindered by inconsistent datasets and annotations. It introduces SubData, an open-source library to standardize heterogeneous hate-speech datasets under a unified taxonomy, enabling reliable cross-dataset comparisons. Coupled with a theory-driven hypothesis-testing framework, the approach tests how differently aligned LLMs classify content, using controlled experiments rather than contested ground-truth labels. The paper demonstrates the framework through an empirical use case with persona-conditioned hate-speech detection and discusses extensions, limitations, and ethical considerations for broader, multi-construct benchmarking in the future.

Abstract

As increasingly capable large language models (LLMs) emerge, researchers have begun exploring their potential for subjective tasks. While recent work demonstrates that LLMs can be aligned with diverse human perspectives, evaluating this alignment on downstream tasks (e.g., hate speech detection) remains challenging because studies use inconsistent datasets. To address this issue, in this resource paper we propose a two-step framework: we (1) introduce SubData, an open-source Python library designed for standardizing heterogeneous datasets to evaluate LLM perspective alignment; and (2) present a theory-driven approach leveraging this library to test how differently aligned LLMs (e.g., aligned with different political viewpoints) classify content targeting specific demographics. SubData's flexible mapping and taxonomy enable customization for diverse research needs, distinguishing it from existing resources. We illustrate its usage with an example application and invite contributions to extend our initial release into a multi-construct benchmark suite for evaluating LLM perspective alignment on natural language processing tasks.

Paper Structure

This paper contains 37 sections, 2 figures, 3 tables.

Figures (2)

  • Figure 1: Overview of our proposed evaluation framework. SubData consolidates instances from diverse datasets into a unified resource. To assess LLM alignment with human perspectives from the combined dataset, we propose a workflow that tests theory-derived (T) hypotheses (H) through controlled experiments (E), measuring how accurately LLMs reflect viewpoints of different demographic and ideological groups.
  • Figure 2: SubData taxonomy structure with target groups organized by category. Note: targets that should end in "_unspecified" have been abbreviated in the figure using "_unsp."
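
The consolidation idea behind the two figures can be illustrated with a minimal sketch in plain Python. It does not use SubData's actual API: the taxonomy entries, dataset names, label strings, and the `consolidate` and `category_of` helpers are all hypothetical and chosen only for illustration. The sketch assumes that each source dataset ships its own label vocabulary and that a per-dataset mapping translates those labels into targets of one shared taxonomy, including "_unspecified" targets for labels that name only a category.

```python
from collections import defaultdict

# Hypothetical unified taxonomy: category -> set of target groups.
# Targets ending in "_unspecified" cover labels that name only the category.
TAXONOMY = {
    "religion": {"muslims", "jews", "religion_unspecified"},
    "gender": {"women", "men", "gender_unspecified"},
}

# Hypothetical per-dataset mappings from original labels to unified targets.
MAPPINGS = {
    "dataset_a": {"Islam": "muslims", "female": "women"},
    "dataset_b": {"anti-muslim": "muslims", "religion": "religion_unspecified"},
}


def category_of(target):
    """Return the taxonomy category that contains the given target."""
    for category, targets in TAXONOMY.items():
        if target in targets:
            return category
    raise KeyError(f"target {target!r} is not in the taxonomy")


def consolidate(datasets):
    """Group (text, label) rows from several datasets by unified target.

    Rows whose label has no entry in the dataset's mapping are skipped.
    """
    unified = defaultdict(list)
    for name, rows in datasets.items():
        mapping = MAPPINGS[name]
        for text, label in rows:
            target = mapping.get(label)
            if target is None:
                continue  # label not covered by the mapping
            unified[target].append(
                {"text": text, "source": name, "category": category_of(target)}
            )
    return dict(unified)


# Toy input: two datasets with different label vocabularies.
example = {
    "dataset_a": [("t1", "Islam"), ("t2", "female"), ("t3", "other")],
    "dataset_b": [("t4", "anti-muslim"), ("t5", "religion")],
}
result = consolidate(example)
```

In this toy run, instances from both datasets end up under the same `muslims` key even though their original labels differ, which is the kind of cross-dataset comparability the abstract describes. Changing an entry in `MAPPINGS` or `TAXONOMY` is the sketch's stand-in for the customization the paper attributes to SubData's mapping and taxonomy.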