In-Context Bias Propagation in LLM-Based Tabular Data Generation
Pol G. Recasens, Alberto Gutierrez, Jordi Torres, Josep Ll. Berral, Javier Carnerero-Cano, Anisa Halimi, Kieran Fraser
TL;DR
This work shows that biases in the in-context examples of a prompt propagate systematically to LLM-generated tabular data and distort downstream fairness along univariate, conditional, and intersectional dimensions. The authors model the synthetic distribution as a two-component mixture and quantify the shift with a drift score, revealing linear propagation patterns that intensify with larger context windows. They introduce adversarial in-context bias injection as a safety risk, demonstrating that a small fraction of biased demonstrations can degrade the fairness of downstream classifiers while preserving utility. They also evaluate defenses that preprocess the in-context examples and find only partial mitigation, underscoring the need for more robust safeguards in LLM-based synthetic data pipelines.
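The linear propagation pattern follows directly from the mixture view, and a small simulation can make that concrete. The sketch below is illustrative only: it assumes the two components are a fixed model prior and the in-context distribution, with a mixing weight `lam`; the function names, the prior share of 0.5, and the weight of 0.6 are hypothetical, and the paper's exact drift score is not defined in this section. Under these assumptions the synthetic share of a group is linear in its in-context share, so a least-squares slope recovers the mixing weight.

```python
import random


def mixture_proportion(p_context, p_prior, lam):
    """Expected share of a group under a two-component mixture.

    Assumption: the synthetic share is lam * (in-context share)
    + (1 - lam) * (model prior share).
    """
    return lam * p_context + (1.0 - lam) * p_prior


def sample_synthetic_share(p_context, p_prior, lam, n, rng):
    """Draw n synthetic rows and return the observed share of the group."""
    p = mixture_proportion(p_context, p_prior, lam)
    return sum(rng.random() < p for _ in range(n)) / n


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


rng = random.Random(0)
context_shares = [0.1, 0.3, 0.5, 0.7, 0.9]  # group share in the prompt
synthetic_shares = [
    sample_synthetic_share(p, p_prior=0.5, lam=0.6, n=20000, rng=rng)
    for p in context_shares
]
# Under the mixture assumption, this slope estimates the mixing weight lam.
slope = fit_slope(context_shares, synthetic_shares)
print(f"estimated slope: {slope:.3f}")
```

In this toy setting, a slope near 0 would mean the generator ignores the prompt's group proportions, and a slope near 1 would mean it copies them.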
Abstract
Large Language Models (LLMs) are increasingly used for synthetic tabular data generation through in-context learning (ICL), offering a practical solution for data augmentation in data-scarce scenarios. While prior work has shown that LLMs can improve downstream task performance by augmenting underrepresented groups, these benefits often assume access to a subset of unbiased in-context examples that is representative of the real dataset. In real-world settings, however, data is frequently noisy and demographically skewed. In this paper, we systematically study how statistical biases within in-context examples propagate to the distribution of synthetic tabular data, showing that even mild in-context biases lead to global statistical distortions. We further introduce an adversarial scenario in which a malicious contributor injects bias into the synthetic dataset via a subset of in-context examples, ultimately compromising the fairness of downstream classifiers for a targeted protected subgroup. Finally, we evaluate mitigation strategies based on preprocessing in-context examples, demonstrating that while such interventions can attenuate disparity, the inherent sensitivity of LLMs to adversarial prompts remains a persistent challenge. Our findings highlight a critical new vulnerability in LLM-based data generation pipelines within sensitive domains.
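The adversarial scenario can be illustrated on the in-context examples alone, before any LLM is involved. The sketch below is a minimal illustration under stated assumptions, not the attack construction used in the paper, which this section does not describe: the record schema (`group`, `label`), the helper names, and the 20% poisoning fraction are all hypothetical. It overwrites a fraction of the demonstrations so that they pair the targeted group with the negative label, then compares positive-label rates across groups.

```python
import random


def inject_bias(examples, fraction, target_group, rng):
    """Return a copy in which a fraction of demonstrations is overwritten
    to pair the target group with the negative label (hypothetical attack)."""
    poisoned = [dict(e) for e in examples]  # leave the clean set untouched
    k = int(round(fraction * len(poisoned)))
    for i in rng.sample(range(len(poisoned)), k):
        poisoned[i]["group"] = target_group
        poisoned[i]["label"] = 0
    return poisoned


def positive_rate(examples, group):
    """Share of positive labels among the demonstrations of one group."""
    rows = [e for e in examples if e["group"] == group]
    return sum(e["label"] for e in rows) / len(rows) if rows else float("nan")


def parity_gap(examples, group_a, group_b):
    """Difference in positive-label rates between two groups."""
    return positive_rate(examples, group_a) - positive_rate(examples, group_b)


rng = random.Random(0)
# Balanced toy demonstrations: 25 rows per (group, label) combination.
clean = [
    {"group": g, "label": y}
    for g in ("A", "B")
    for y in (0, 1)
    for _ in range(25)
]
poisoned = inject_bias(clean, fraction=0.2, target_group="B", rng=rng)

print("clean gap:   ", parity_gap(clean, "A", "B"))
print("poisoned gap:", parity_gap(poisoned, "A", "B"))
```

Measuring the gap on the demonstrations only shows the skew the generator is exposed to; how much of it reaches the synthetic data and the downstream classifier is the question the paper studies.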
