The Impact of Group Membership Bias on the Quality and Fairness of Exposure in Ranking

Ali Vardasbi, Maarten de Rijke, Fernando Diaz, Mostafa Dehghani

TL;DR

The study tackles group membership bias in learning-to-rank, showing that user-perceived group membership can distort attractiveness signals and degrade both ranking quality and merit-based exposure fairness. It develops a two-group analytical framework with a multiplicative bias $P(A \mid q,d,g)=\beta_g \cdot P(R \mid q,d)$ and a bias fraction $\nu$, deriving that changes in $\text{NDCG}$ and fairness metrics (EEL, DTR) scale with $\nu$ and that $\mathbb{E}[\nu]=\max(2 - \beta_A^{-1},0)$. To correct bias without enforcing equality, it proposes an IPS-based amortized correction: estimate $\hat{\beta}_A$ via distribution matching of attractiveness across aggregated queries using a KS test, and apply exposure corrections across query clusters. Empirical results across diverse tabular and general LTR datasets show that group bias harms both head and tail queries and that the amortized correction substantially recovers ranking quality and fairness, often achieving near full-information performance when distribution assumptions hold and clustering is accurate. This work advances practical fair exposure in ranking by providing a principled measurement-correction pipeline validated across regimes and datasets.
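To make the two formulas in the TL;DR concrete, here is a minimal Python sketch that evaluates the multiplicative bias model and the expected bias fraction. The function names are hypothetical, and restricting $\beta_A$ to $(0, 1]$ is an assumption made here (the affected group's utility is described as under-estimated), not something stated by the formula itself.

```python
def biased_attractiveness(relevance_prob: float, beta_g: float) -> float:
    """Multiplicative bias model: P(A | q, d, g) = beta_g * P(R | q, d)."""
    return beta_g * relevance_prob


def expected_bias_fraction(beta_a: float) -> float:
    """Expected bias fraction E[nu] = max(2 - 1 / beta_A, 0).

    Assumption of this sketch: beta_A lies in (0, 1].
    """
    if not 0.0 < beta_a <= 1.0:
        raise ValueError("beta_a is assumed to lie in (0, 1]")
    return max(2.0 - 1.0 / beta_a, 0.0)


# Evaluate the bias fraction for a few illustrative bias strengths.
for beta in (1.0, 0.8, 0.5, 0.4):
    print(beta, expected_bias_fraction(beta))
```

For example, $\beta_A = 0.8$ gives $2 - 1.25 = 0.75$, while any $\beta_A \le 0.5$ is clipped to $0$ by the $\max$.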

Abstract

When learning to rank from user interactions, search and recommender systems must address biases in user behavior to provide a high-quality ranking. One type of bias that has recently been studied in the ranking literature is when sensitive attributes, such as gender, have an impact on a user's judgment about an item's utility. For example, in a search for an expertise area, some users may be biased towards clicking on male candidates over female candidates. We call this type of bias group membership bias. Increasingly, we seek rankings that are fair to individuals and sensitive groups. Merit-based fairness measures rely on the estimated utility of the items. With group membership bias, the utility of the sensitive groups is underestimated; hence, without correcting for this bias, a supposedly fair ranking is not truly fair. In this paper, we first analyze the impact of group membership bias on ranking quality as well as merit-based fairness metrics and show that group membership bias can hurt both ranking and fairness. Then, we provide a correction method for group bias that is based on the assumption that the utility scores of items in different groups come from the same distribution. This assumption has two potential issues, sparsity and equality-instead-of-equity, which we address with an amortized approach. We show that our correction method can consistently compensate for the negative impact of group membership bias on ranking quality and fairness metrics.
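The idea of estimating the bias by assuming both groups share one utility distribution can be illustrated with a small simulation. The sketch below is not the paper's procedure: it uses synthetic uniform scores, a hand-written two-sample Kolmogorov-Smirnov statistic, and a grid search over candidate bias values, all of which are assumptions chosen for illustration. The names `ks_statistic` and `estimate_beta` are hypothetical.

```python
import random
from bisect import bisect_right


def ks_statistic(xs, ys):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    gap = 0.0
    for v in xs + ys:
        fx = bisect_right(xs, v) / len(xs)
        fy = bisect_right(ys, v) / len(ys)
        gap = max(gap, abs(fx - fy))
    return gap


def estimate_beta(affected, unaffected, candidates):
    """Return the candidate beta whose inverse scaling of the affected
    group's scores best matches the unaffected group's distribution."""
    return min(
        candidates,
        key=lambda b: ks_statistic([a / b for a in affected], unaffected),
    )


# Synthetic data (assumption): both groups have uniform true scores,
# and the affected group's observed scores are shrunk by true_beta.
rng = random.Random(0)
true_beta = 0.7
unaffected = [rng.random() for _ in range(2000)]
affected = [true_beta * rng.random() for _ in range(2000)]

candidates = [round(0.05 * k, 2) for k in range(2, 21)]  # 0.10 ... 1.00
beta_hat = estimate_beta(affected, unaffected, candidates)
print("estimated beta:", beta_hat)
```

Pooling scores across many queries before matching, as the amortized approach in the abstract suggests, gives the KS statistic more samples to work with than any single sparse query would.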

Paper Structure

This paper contains 20 sections, 3 theorems, 10 equations, 9 figures, and 2 tables.

Key Result

Theorem 1

In the presence of group bias, for uniformly distributed attractiveness scores, the change in the NDCG of the list, sorted based on items' attractiveness, can be approximated by a linear function of $\mathbb{E}[\nu]$, i.e., the fraction of affected relevant items that are still as attr

Figures (9)

  • Figure 1: The effect of group bias on user clicks.
  • Figure 2: The impact of group bias on ranking performance (left) and DTR fairness metric (right) for the Yahoo! dataset.
  • Figure 3: Contrasting per-query and amortized correction.
  • Figure 4: Ratio of affected to non-affected group members in terms of population and average utility score (relevance) for different sensitive attributes in Yahoo! and MSLR datasets.
  • Figure 5: The impact of group bias on ranking quality for the Yahoo! and MSLR datasets with different sensitive attributes.
  • ...and 4 more figures

Theorems & Definitions (8)

  • Remark 1
  • Remark 2
  • Remark 3
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Remark 4
  • Remark 5