Evidence-based Distributional Alignment for Large Language Models

Viet-Thanh Pham; Lizhen Qu; Zhuang Li; Gholamreza Haffari

Abstract

Distributional alignment enables large language models (LLMs) to predict how a target population distributes its responses across answer options, rather than collapsing disagreement into a single consensus answer. However, existing LLM-based distribution prediction is often unstable and degrades under cultural and domain shift. Token score-based estimates can change with minor changes to option wording or formatting; response sampling-based estimates are expensive and sensitive to prompts and decoding settings; and directly generated distributions are frequently miscalibrated. We propose Evi-DA, an evidence-based alignment technique that improves the fidelity and robustness of LLM-based distribution estimation under domain and cultural shift. Given a target country and a multiple-choice question, Evi-DA retrieves related World Values Survey items and their answer distributions, predicts a coarse Welzel value signature for each option, and infers the country-conditioned answer distribution in a structured format. We train the LLMs using a two-stage pipeline in which reinforcement learning optimizes survey-derived rewards that encourage accurate intermediate value predictions, faithful final distributions, well-formed structured outputs, and reduced cultural bias. Across in-domain and out-of-domain benchmarks and multiple open-source backbones, Evi-DA reduces the Jensen-Shannon divergence between predicted and gold distributions relative to strong baselines, with average relative improvements of up to 44%.
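The abstract reports results as Jensen-Shannon divergence between a predicted and a gold answer distribution. As a rough illustration of that metric, the sketch below computes it for two distributions over the same answer options. The function names, the base-2 logarithm, and the example numbers are assumptions made here for illustration; they are not taken from the paper, which may use a different log base or normalization.

```python
import math


def normalize(weights):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so the value lies in [0, 1])."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same answer options")
    p, q = normalize(p), normalize(q)
    # Mixture distribution; m_i > 0 whenever p_i > 0 or q_i > 0,
    # so the KL terms above never divide by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    # Hypothetical three-option question: model prediction vs. survey result.
    predicted = [0.20, 0.50, 0.30]
    gold = [0.25, 0.45, 0.30]
    print(f"JSD = {js_divergence(predicted, gold):.4f}")
```

Under this base-2 convention, identical distributions score 0 and distributions with disjoint support score 1, so lower values mean the prediction is closer to the survey distribution.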

Paper Structure (46 sections, 9 equations, 3 figures, 5 tables)

Introduction
Method
Overview
Welzel values as a latent representation
Evidence bank construction from the World Values Survey
Retrieving evidence for input question
Two-stage inference with structured intermediate representations
Stage A: option value profiling.
Stage B: distribution prediction conditioned on value evidence.
Reinforcement learning with GRPO
Experiments
Evaluation Setup
Benchmarks.
Demographic Representation.
Evaluation Metric.
...and 31 more sections

Figures (3)

Figure 1: Illustration of Evi-DA, our proposed method for distributional alignment with LLMs.
Figure 2: Illustration of Evi-DA with different LLM backbones on the out-of-domain benchmark when varying $K$, the number of retrieved samples.
Figure 3: Illustration of Evi-DA with different embedding models for retrieval on the out-of-domain benchmark.
