Aligned but Stereotypical? The Hidden Influence of System Prompts on Social Bias in LVLM-Based Text-to-Image Models
NaHyeon Park, Namin An, Kunhee Kim, Soyeon Yoon, Jiahao Huo, Hyunjung Shim
TL;DR
The paper shows that LVLM-based text-to-image models exhibit stronger demographic biases than non-LVLM systems, and that system prompts are a key mechanism driving this bias. To enable robust evaluation, the authors build a 1,024-prompt benchmark spanning four levels of linguistic complexity and measure bias with LVLM-based judges, reporting a strong link among prompt complexity, demographic cues, and alignment. Their mechanistic analysis shows that system prompts encode demographic priors that shift token probabilities and text embeddings, which in turn bias image synthesis. As a practical contribution, they introduce FairPro, a training-free meta-prompting framework in which the LVLM self-audits and replaces the default system prompt with a fairness-aware version at test time. On two LVLM-based T2I models, FairPro substantially reduces bias while preserving text-image alignment. The work highlights the pivotal role of system prompts in bias propagation and offers a deployable method for building more socially responsible LVLM-based generative systems.
Abstract
Large vision-language model (LVLM)-based text-to-image (T2I) systems have become the dominant paradigm in image generation, yet whether they amplify social biases remains insufficiently understood. In this paper, we show that LVLM-based models produce markedly more socially biased images than non-LVLM-based models. We introduce a 1,024-prompt benchmark spanning four levels of linguistic complexity and systematically evaluate demographic bias across multiple attributes. Our analysis identifies system prompts, the predefined instructions guiding LVLMs, as a primary driver of biased behavior. Through decoded intermediate representations, token-probability diagnostics, and embedding-association analyses, we reveal how system prompts encode demographic priors that propagate into image synthesis. To address this, we propose FairPro, a training-free meta-prompting framework that enables LVLMs to self-audit and construct fairness-aware system prompts at test time. Experiments on two LVLM-based T2I models, SANA and Qwen-Image, show that FairPro substantially reduces demographic bias while preserving text-image alignment. We believe our findings provide deeper insight into the central role of system prompts in bias propagation and offer a practical, deployable approach for building more socially responsible T2I systems.
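As a rough illustration of the test-time idea behind FairPro, the Python sketch below shows one way a meta-prompting step could wrap an existing pipeline: the LVLM is asked to audit the default system prompt for a given user prompt and return a fairness-aware replacement, which is then passed to the generator. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `META_PROMPT`, `fair_system_prompt`, and `generate`, the wording of the meta-prompt, and the callable interfaces for the LVLM and the T2I model are all hypothetical.

```python
from typing import Any, Callable

# Hypothetical meta-prompt; the wording used in the paper is not given here.
META_PROMPT = (
    "You write system prompts for a text-to-image model.\n"
    "Default system prompt: {system_prompt}\n"
    "User prompt: {user_prompt}\n"
    "List demographic assumptions the default system prompt could introduce "
    "for this user prompt, then write a revised system prompt that avoids "
    "them while staying faithful to the user prompt. "
    "Output only the revised system prompt."
)


def fair_system_prompt(
    lvlm: Callable[[str], str],
    default_system_prompt: str,
    user_prompt: str,
) -> str:
    """Ask the LVLM to audit and rewrite the system prompt (assumed interface)."""
    meta = META_PROMPT.format(
        system_prompt=default_system_prompt, user_prompt=user_prompt
    )
    revised = lvlm(meta).strip()
    # Fall back to the default if the model returns nothing usable.
    return revised or default_system_prompt


def generate(
    t2i: Callable[[str, str], Any],
    lvlm: Callable[[str], str],
    default_system_prompt: str,
    user_prompt: str,
) -> Any:
    """Swap in the audited system prompt at test time; no weights are updated."""
    system_prompt = fair_system_prompt(lvlm, default_system_prompt, user_prompt)
    return t2i(system_prompt, user_prompt)


if __name__ == "__main__":
    # Stand-ins for a real LVLM and T2I model, used only to show the call flow.
    stub_lvlm = lambda meta: "Depict people without assuming gender, age, or ethnicity."
    stub_t2i = lambda system_prompt, user_prompt: {
        "system": system_prompt,
        "prompt": user_prompt,
    }
    print(generate(stub_t2i, stub_lvlm, "Describe the image in detail.", "a photo of a doctor"))
```

Because the only change is the system prompt handed to the generator, a wrapper of this shape needs no retraining, which is the property the abstract describes as training-free.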
