DAIQ: Auditing Demographic Attribute Inference from Question in LLMs
Srikant Panda, Hitesh Laxmichand Patel, Shahad Al-Khalifa, Amit Agarwal, Hend Al-Khalifa, Sharefah Al-Ghamdi
TL;DR
DAIQ introduces a diagnostic framework to audit whether LLMs infer demographic attributes from demographically neutral questions, reframing such inference as epistemic overreach and treating abstention as the normative response under uncertainty. Evaluating 18 models across six real-world domains and five attributes, the study finds pervasive, often stereotype-aligned inference, and shows that inferred demographics can act as latent modifiers of downstream responses. The work demonstrates that abstention-oriented prompting can substantially curb unintended demographic inference without fine-tuning, and that learned population priors, not decoding randomness, drive these effects. It advocates for evaluation standards that assess not only how models respond to demographic cues but whether they should infer such cues at all, highlighting implications for privacy, fairness, and robust deployment.
Abstract
Recent evaluations of large language models (LLMs) audit social bias primarily through prompts that explicitly reference demographic attributes, overlooking whether models infer sensitive demographics from neutral questions. Such inference constitutes epistemic overreach and raises concerns for privacy. We introduce Demographic Attribute Inference from Questions (DAIQ), a diagnostic audit framework for evaluating demographic inference under epistemic uncertainty. We evaluate 18 open- and closed-source LLMs across six real-world domains and five demographic attributes. We find that many models infer demographics from neutral questions, defaulting to socially dominant categories and producing stereotype-aligned rationales. These behaviors persist across model families, scales, and decoding settings, indicating reliance on learned population priors. We further show that inferred demographics can condition downstream responses and that abstention-oriented prompting substantially reduces unintended inference without model fine-tuning. Our results suggest that current bias evaluations are incomplete and motivate evaluation standards that assess not only how models respond to demographic information, but whether they should infer it at all.
