Controlling Gender Bias in Retrieval via a Backpack Architecture
Amirabbas Afzali, Amirreza Velae, Iman Ahmadi, Mohammad Aliannejadi
TL;DR
The paper tackles gender bias in retrieval and ranking systems built on large language models. It introduces a bias-controllable ranking framework based on Backpack Language Models that reweights sense vectors at inference time to control a targeted social attribute, so fairness can be adjusted without modifying the trained model. The key contributions are the Sense-Attribute Alignment procedure and a per-sense reweighting mechanism governed by a fairness parameter $\lambda$, which together reduce gender bias while largely preserving ranking performance on MS MARCO. Because it requires no retraining, the approach is practical to deploy for fair information retrieval, and it may extend to other biases and larger models.
Abstract
The presence of social biases in large language models (LLMs) has become a significant concern in AI research. These biases, often embedded in training data, can perpetuate harmful stereotypes and distort decision-making processes. When LLMs are integrated into ranking systems, they can propagate these biases, leading to unfair outcomes in critical applications such as search engines and recommendation systems. Backpack Language Models, unlike traditional transformer-based models that treat text sequences as monolithic structures, generate outputs as weighted combinations of non-contextual, learned word aspects, also known as senses. Leveraging this architecture, we propose a framework for debiasing ranking tasks. Our experimental results show that this framework effectively mitigates gender bias in text retrieval and ranking with minimal degradation in performance.
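To make the idea of reweighting senses concrete, here is a minimal sketch in plain Python. It relies on the Backpack property stated above: each output is a weighted combination of non-contextual sense vectors, so a single sense can be scaled without touching the rest of the model. The function names, the per-sense `alignment` scores, and the linear rule `1 - lam * alignment` are illustrative assumptions, not the paper's actual Sense-Attribute Alignment procedure or reweighting formula.

```python
from typing import List, Optional

Vector = List[float]


def fairness_scale(alignment: List[float], lam: float) -> List[float]:
    """Hypothetical rule: shrink sense l in proportion to lam * alignment[l].

    alignment[l] in [0, 1] is an assumed score of how strongly sense l
    encodes the targeted attribute; lam in [0, 1] is the fairness parameter.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [1.0 - lam * a for a in alignment]


def backpack_output(
    senses: List[List[Vector]],      # senses[j][l]: sense vector l of token j
    alpha: List[List[List[float]]],  # alpha[l][i][j]: contextual weight
    scale: Optional[List[float]] = None,
) -> List[Vector]:
    """Compute o_i = sum_j sum_l scale[l] * alpha[l][i][j] * senses[j][l]."""
    n, k, d = len(senses), len(senses[0]), len(senses[0][0])
    if scale is None:
        scale = [1.0] * k  # no reweighting: the plain Backpack combination
    out = [[0.0] * d for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for l in range(k):
                w = scale[l] * alpha[l][i][j]
                for t in range(d):
                    out[i][t] += w * senses[j][l][t]
    return out


# Toy input: 2 tokens, 2 senses, 2 dimensions, uniform contextual weights.
toy_senses = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 3.0]],
]
toy_alpha = [[[0.5, 0.5], [0.5, 0.5]] for _ in range(2)]
toy_alignment = [1.0, 0.0]  # assume only sense 0 carries the attribute

baseline = backpack_output(toy_senses, toy_alpha)
debiased = backpack_output(
    toy_senses, toy_alpha, fairness_scale(toy_alignment, lam=1.0)
)
```

In this sketch, `lam = 0` leaves the representation unchanged and `lam = 1` removes the contribution of any sense whose assumed alignment score is 1. Intermediate values trade off between the two, which is the kind of inference-time control, with no retraining, that the framework is built around.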
