Survey Transfer Learning: Recycling Data with Silicon Responses
Ali Amini
TL;DR
The paper addresses the environmental and methodological drawbacks of using large language models (LLMs) to generate synthetic survey data. It proposes Survey Transfer Learning (STL), which reuses gold-standard survey data (CES and ANES) and transfers learned demographic–partisan structure via Anchor Transfer Variables in a three-stage backbone–head neural network to produce empirically grounded silicon responses. STL achieves strong cross-survey performance, e.g., $\mathrm{AUC} \approx 0.97$ for vote prediction, and distributional fidelity with a Kolmogorov–Smirnov (KS) statistic $< 0.03$ and a Wasserstein distance $< 0.03$, outperforming LLM-generated data and traditional imputation on sensitive measures such as racial resentment. The approach offers a sustainable, transparent alternative for missing-data imputation and cross-survey augmentation, enabling reproducible research while reducing environmental impact. It also frames surveys as interconnected data resources, paving the way for broader cross-survey integration and methodological innovation in political science.
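To make the two distributional-fidelity metrics concrete, here is a minimal sketch of how a KS statistic and a one-dimensional Wasserstein distance could be computed between observed and silicon responses. The function names and the toy 1–5 scale responses are illustrative assumptions, not the paper's code or data, and the paper may compute these metrics on different quantities (for example, predicted probabilities).

```python
def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    largest_gap = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        # Advance both samples past every value <= v, then compare ECDFs at v.
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        largest_gap = max(largest_gap, abs(i / len(a) - j / len(b)))
    return largest_gap


def wasserstein_1d(a, b):
    """1-D Wasserstein distance: area between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    i = j = 0
    area = 0.0
    for left, right in zip(points, points[1:]):
        while i < len(a) and a[i] <= left:
            i += 1
        while j < len(b) and b[j] <= left:
            j += 1
        # ECDFs are constant on [left, right), so the area is a rectangle.
        area += abs(i / len(a) - j / len(b)) * (right - left)
    return area


# Hypothetical toy responses on a 1-5 scale (not real survey data).
observed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
silicon = [1, 2, 2, 3, 3, 4, 4, 4, 5, 5]

ks = ks_statistic(observed, silicon)
wd = wasserstein_1d(observed, silicon)
print(f"KS = {ks:.3f}, Wasserstein = {wd:.3f}")
```

In this toy case one of ten responses differs by one scale point, so both metrics come out at about 0.1; values below 0.03, as reported above, indicate a much closer match between the two distributions.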
Abstract
As researchers increasingly turn to large language models (LLMs) to generate synthetic survey data, less attention has been paid to alternative AI paradigms, despite the environmental costs of LLMs. This paper introduces Survey Transfer Learning (STL), which adapts transfer learning paradigms from computer science to survey research in order to recycle existing survey data and generate empirically grounded silicon responses. Inspired by political behavior theory, STL leverages shared demographic variables with high predictive power in a polarized American context to transfer knowledge across surveys. Using a neural network pre-trained on the Cooperative Election Study (CES) 2020, freezing early layers to preserve learned structure, and fine-tuning top layers on the American National Election Studies (ANES) 2020, STL generates silicon responses in CES 2022 and in held-out ANES 2020 data with accuracy rates of up to 93 percent. Results show that STL outperforms LLMs, especially on sensitive measures such as racial resentment. While LLM silicon samples are costly and opaque, STL generates empirically grounded silicon responses with high individual-level accuracy, potentially helping to mitigate key challenges in social science and the polling industry.
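The pre-train, freeze, and fine-tune recipe described above can be sketched with a tiny backbone–head network. This is a simplified illustration under stated assumptions, not the author's architecture: the single hidden layer, the hyperparameters, the three binary "anchor" features, and the synthetic `source_survey` and `target_survey` data are all hypothetical stand-ins for CES and ANES.

```python
import copy
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_params(n_inputs, n_hidden, rng):
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(n_hidden)],          # backbone weights
        "b1": [0.0] * n_hidden,                    # backbone biases
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],  # head weights
        "b2": 0.0,                                 # head bias
    }


def forward(params, x):
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(params["W1"], params["b1"])]
    logit = sum(w * h for w, h in zip(params["w2"], hidden)) + params["b2"]
    return hidden, sigmoid(logit)


def train(params, data, epochs, lr, freeze_backbone=False):
    """Plain SGD on binary cross-entropy; optionally leaves the backbone untouched."""
    for _ in range(epochs):
        for x, y in data:
            hidden, p = forward(params, x)
            err = p - y  # gradient of the loss with respect to the logit
            if not freeze_backbone:
                # Backbone step uses the head weights from the forward pass.
                for j in range(len(hidden)):
                    g = err * params["w2"][j] * (1.0 - hidden[j] ** 2)
                    params["b1"][j] -= lr * g
                    for i in range(len(x)):
                        params["W1"][j][i] -= lr * g * x[i]
            for j in range(len(hidden)):
                params["w2"][j] -= lr * err * hidden[j]
            params["b2"] -= lr * err


def make_toy_survey(n, threshold, rng):
    """Synthetic respondents: three binary features and a binary outcome."""
    rows = []
    for _ in range(n):
        x = [float(rng.randint(0, 1)) for _ in range(3)]
        y = 1.0 if x[0] + x[1] - x[2] >= threshold else 0.0
        rows.append((x, y))
    return rows


rng = random.Random(0)
source_survey = make_toy_survey(300, 1, rng)  # stand-in for the pre-training survey
target_survey = make_toy_survey(100, 0, rng)  # stand-in for the fine-tuning survey

params = init_params(3, 4, rng)
train(params, source_survey, epochs=20, lr=0.1)            # stage 1: pre-train everything
backbone_before = copy.deepcopy((params["W1"], params["b1"]))
head_bias_before = params["b2"]
train(params, target_survey, epochs=20, lr=0.1,
      freeze_backbone=True)                                # stage 2: fine-tune the head only

# "Silicon responses": predicted probabilities for target respondents.
silicon = [forward(params, x)[1] for x, _ in target_survey]
```

Freezing keeps the representation learned from the larger source survey fixed, so only the small head has to adapt to the target survey's outcome.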
