The Impact of Group Membership Bias on the Quality and Fairness of Exposure in Ranking
Ali Vardasbi, Maarten de Rijke, Fernando Diaz, Mostafa Dehghani
TL;DR
The study tackles group membership bias in learning to rank (LTR), showing that user-perceived group membership can distort attractiveness signals and degrade both ranking quality and merit-based exposure fairness. It develops a two-group analytical framework with a multiplicative bias $P(A \mid q,d,g)=\beta_g \cdot P(R \mid q,d)$ and a bias fraction $\nu$, deriving that the changes in $\text{NDCG}$ and in the fairness metrics (EEL, DTR) scale with $\nu$ and that $E[\nu]=\max(2 - \beta_A^{-1},0)$. To correct the bias without enforcing equality, it proposes an amortized correction based on inverse propensity scoring (IPS): estimate $\hat{\beta}_A$ by matching the attractiveness distributions of the groups across aggregated queries using a Kolmogorov-Smirnov (KS) test, then apply exposure corrections across query clusters. Empirical results on diverse tabular and general LTR datasets show that group bias harms both head and tail queries and that the amortized correction substantially recovers ranking quality and fairness, often reaching near full-information performance when the distribution assumptions hold and the clustering is accurate. The work advances practical fair exposure in ranking by providing a principled measurement-and-correction pipeline validated across regimes and datasets.
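The distribution-matching idea can be illustrated with a small simulation. The sketch below is not the authors' implementation: it assumes two groups whose relevance scores are drawn from the same distribution, applies a multiplicative bias to one of them, and recovers the bias factor by a grid search that minimizes the two-sample KS statistic. The function name `estimate_beta`, the Beta(2, 5) relevance distribution, the grid, and the choice of which group is biased are all hypothetical.

```python
import numpy as np
from scipy.stats import ks_2samp


def estimate_beta(attr_ref, attr_aff, grid=None):
    """Grid-search the factor b that best aligns attr_aff / b with attr_ref.

    Hypothetical helper: picks the b with the smallest two-sample KS statistic.
    """
    if grid is None:
        grid = np.linspace(0.05, 1.0, 96)  # candidate bias factors, step 0.01
    stats = [ks_2samp(attr_ref, attr_aff / b).statistic for b in grid]
    return float(grid[int(np.argmin(stats))])


rng = np.random.default_rng(0)
true_beta = 0.6  # assumed bias factor for the affected group

# Assumption: both groups share one relevance distribution (Beta(2, 5) here).
rel_ref = rng.beta(2, 5, size=5000)
rel_aff = rng.beta(2, 5, size=5000)

# Multiplicative bias: attractiveness = beta_g * relevance (beta = 1 for the reference group).
attr_ref = rel_ref
attr_aff = true_beta * rel_aff

beta_hat = estimate_beta(attr_ref, attr_aff)

# IPS-style correction: divide the biased signal by the estimated factor.
attr_aff_corrected = attr_aff / beta_hat
```

With enough aggregated samples, `beta_hat` should land close to `true_beta`; with sparse per-query data the KS statistic becomes noisy, which is one reason to pool queries before estimating.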
Abstract
When learning to rank from user interactions, search and recommender systems must address biases in user behavior to provide a high-quality ranking. One type of bias that has recently been studied in the ranking literature arises when sensitive attributes, such as gender, affect a user's judgment about an item's utility. For example, in a search for an expertise area, some users may be biased towards clicking on male candidates over female candidates. We call this type of bias group membership bias. Increasingly, we seek rankings that are fair to individuals and sensitive groups. Merit-based fairness measures rely on the estimated utility of the items. With group membership bias, the utility of the sensitive groups is under-estimated; hence, without correcting for this bias, a supposedly fair ranking is not truly fair. In this paper, we first analyze the impact of group membership bias on ranking quality as well as on merit-based fairness metrics, and show that group membership bias can hurt both ranking and fairness. We then provide a correction method for group bias that is based on the assumption that the utility scores of items in different groups come from the same distribution. This assumption raises two potential issues, sparsity and equality-instead-of-equity, which we address with an amortized approach. We show that our correction method can consistently compensate for the negative impact of group membership bias on ranking quality and fairness metrics.
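A toy calculation illustrates why under-estimated utility undermines merit-based fairness. The sketch below is a deliberate simplification, not one of the fairness metrics used in the paper: it treats each group's share of total estimated utility as the exposure share that a merit-proportional ranker would target. The item scores, group labels, bias factors, and the helper `group_merit_share` are hypothetical.

```python
import numpy as np

# Four items with identical true relevance across the two groups (assumed values).
relevance = np.array([0.8, 0.4, 0.8, 0.4])
group = np.array([0, 0, 1, 1])  # group 1 is the one affected by the bias
beta = np.array([1.0, 0.5])     # assumed multiplicative bias per group

# Observed attractiveness under the multiplicative bias model.
attractiveness = beta[group] * relevance


def group_merit_share(scores, group):
    """Fraction of total estimated utility held by each group."""
    totals = np.bincount(group, weights=scores)
    return totals / totals.sum()


true_share = group_merit_share(relevance, group)         # equal shares: 0.5 each
biased_share = group_merit_share(attractiveness, group)  # about 2/3 vs 1/3
```

A ranker that allocates exposure in proportion to `biased_share` would look fair with respect to the observed signal while giving the affected group only a third of the exposure, although its true merit warrants half.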
