How Reasoning Influences Intersectional Biases in Vision Language Models
Adit Desai, Sudipta Roy, Mohna Chakraborty
TL;DR
This work investigates how reasoning in Vision Language Models (VLMs) contributes to intersectional biases in occupation prediction tasks. It introduces a framework that jointly collects predictions and natural-language reasoning from five open-source VLMs on 32 occupations using three prompting styles. The authors show that biases permeate both outputs and explanations: demographic markers appear in the reasoning, and post-hoc rationalizations persist even when reasoning is included. They also demonstrate that model scale alters the quality and content of reasoning, yet biases remain a concern, underscoring the need to align VLM reasoning with human values before deployment and to develop mitigation strategies.
Abstract
Vision Language Models (VLMs) are increasingly deployed across downstream tasks, yet their training data often encode social biases that surface in outputs. Unlike humans, who interpret images through contextual and social cues, VLMs process them through statistical associations, often leading to reasoning that diverges from human reasoning. By analyzing how a VLM reasons, we can understand how inherent biases are perpetuated and how they can adversely affect downstream performance. To examine this gap, we systematically analyze social biases in five open-source VLMs on an occupation prediction task using the FairFace dataset. Across 32 occupations and three prompting styles, we elicit both predictions and reasoning. Our findings reveal that biased reasoning patterns systematically underlie intersectional disparities, highlighting the need to align VLM reasoning with human values before downstream deployment.
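As a rough illustration of what an intersectional disparity in occupation prediction can look like, the following minimal Python sketch tallies how often each (race, gender) group is assigned a given occupation. It assumes each model response has already been reduced to a record holding the image's demographic labels and the predicted occupation. The record layout, the function names, the toy records, and the max-minus-min gap are all assumptions made for illustration; they are not the authors' framework, data, or metric.

```python
from collections import Counter


def intersectional_rates(records, occupation):
    """Return, for each (race, gender) group, the share of its images
    that the model labeled with `occupation`."""
    totals = Counter()
    hits = Counter()
    for rec in records:
        group = (rec["race"], rec["gender"])
        totals[group] += 1
        if rec["prediction"] == occupation:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


def disparity(rates):
    """Illustrative gap: highest group rate minus lowest group rate."""
    return max(rates.values()) - min(rates.values())


# Hand-written toy records with placeholder group labels (hypothetical,
# not results from the paper).
records = [
    {"race": "A", "gender": "Female", "prediction": "nurse"},
    {"race": "A", "gender": "Female", "prediction": "nurse"},
    {"race": "A", "gender": "Male", "prediction": "nurse"},
    {"race": "A", "gender": "Male", "prediction": "doctor"},
    {"race": "B", "gender": "Female", "prediction": "doctor"},
    {"race": "B", "gender": "Female", "prediction": "nurse"},
    {"race": "B", "gender": "Male", "prediction": "doctor"},
    {"race": "B", "gender": "Male", "prediction": "doctor"},
]

rates = intersectional_rates(records, "nurse")
gap = disparity(rates)
```

Grouping on the (race, gender) pair rather than on each attribute separately is what makes the tally intersectional: a gap can appear for one combination even when the single-attribute rates look balanced.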
