Prompts to Proxies: Emulating Human Preferences via a Compact LLM Ensemble
Bingchen Wang, Zi-Yu Khoo, Jingtan Wang
TL;DR
Prompts to Proxies introduces preference reconstruction theory, which frames aligning LLM proxies with a target population as constructing a functional basis of proxy agents and weighting them to reproduce observed survey responses. The two-stage P2P pipeline first builds a diverse agent pool via structured prompting with entropy-based adaptive sampling, then selects a compact ensemble via L1-regularized regression so that its aggregate responses match the observed distributions, without demographic data or fine-tuning. Across 14 American Trends Panel (ATP) waves and the World Values Survey, the method shows improved distributional fidelity and favorable cost relative to prompting baselines, evidence of cross-locale generalization, and competitive performance in a stress test against an SFT baseline under topic shift. The work advances pluralistic alignment for social science simulation and points to extensions to freeform outputs, non-stationary preferences, and model steerability benchmarks.
Abstract
Large language models are increasingly used as proxies for human subjects in social science research, yet external validity requires that synthetic agents faithfully reflect the preferences of target human populations. We introduce *preference reconstruction theory*, a framework that formalizes preference alignment as a representation learning problem: constructing a functional basis of proxy agents and recovering population preferences through weighted aggregation. We implement this via *Prompts to Proxies* ($\texttt{P2P}$), a modular two-stage system. Stage 1 uses structured prompting with entropy-based adaptive sampling to construct a diverse agent pool spanning the latent preference space. Stage 2 employs L1-regularized regression to select a compact ensemble whose aggregate response distributions align with observed data from the target population. $\texttt{P2P}$ requires no fine-tuning and no access to sensitive demographic data, incurring only API inference costs. We validate the approach on 14 waves of the American Trends Panel, achieving an average test MSE of 0.014 across diverse topics at approximately 0.8 USD per survey. We additionally test it on the World Values Survey, demonstrating its potential to generalize across locales. When stress-tested against an SFT-aligned baseline, $\texttt{P2P}$ achieves competitive performance using less than 3% of the training data.
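To make the Stage 2 idea concrete, the following is a minimal sketch of selecting a compact weighted ensemble with an L1 penalty. It is an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the nonnegativity constraint on the weights and the proximal-gradient solver are choices made here for simplicity, and the names `fit_sparse_weights`, `agent_matrix`, and `lam` are hypothetical.

```python
import numpy as np


def fit_sparse_weights(agent_matrix, target, lam=1e-3, n_iter=5000):
    """Minimise (1/2n)||A w - y||^2 + lam * sum(w) subject to w >= 0.

    Proximal gradient descent; for nonnegative w the L1 proximal step
    reduces to a shifted projection onto the nonnegative orthant.
    The nonnegativity constraint is an assumption of this sketch.
    """
    n, k = agent_matrix.shape
    # Step size 1/L, where L is the Lipschitz constant of the smooth term.
    step = n / (np.linalg.norm(agent_matrix, 2) ** 2)
    w = np.zeros(k)
    for _ in range(n_iter):
        grad = agent_matrix.T @ (agent_matrix @ w - target) / n
        w = np.maximum(w - step * (grad + lam), 0.0)
    return w


# Synthetic stand-in for a survey (all values hypothetical):
# 20 proxy agents, 30 questions, 4 answer options per question.
rng = np.random.default_rng(0)
n_agents, n_questions, n_options = 20, 30, 4

# Each agent has one response distribution per question.
agent_dists = rng.dirichlet(np.full(n_options, 0.5), size=(n_agents, n_questions))
# Rows: (question, option) pairs; columns: agents.
agent_matrix = agent_dists.reshape(n_agents, -1).T

# Pretend the observed population is a mixture of only three agents.
true_weights = np.zeros(n_agents)
true_weights[[2, 7, 11]] = [0.5, 0.3, 0.2]
target = agent_matrix @ true_weights

weights = fit_sparse_weights(agent_matrix, target)
mse = float(np.mean((agent_matrix @ weights - target) ** 2))
n_selected = int(np.count_nonzero(weights > 1e-6))
print(f"selected agents: {n_selected} of {n_agents}, MSE: {mse:.2e}")
```

The penalty strength `lam` trades reconstruction error against ensemble size: larger values zero out more agents. In practice an off-the-shelf Lasso solver could replace the hand-written loop; the loop is used here only to keep the sketch dependency-light.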
