Quota-based debiasing can decrease representation of already underrepresented groups

Ivan Smirnov, Florian Lemmerich, Markus Strohmaier

TL;DR

The paper addresses the Debiasing Paradox: quota-based debiasing on a single binary attribute can worsen representation for already underrepresented subgroups when other, correlated attributes are ignored. It introduces a theoretical model with two correlated binary attributes (color and shape) in which perceived quality is $\hat{q} = q - d_{color} I^{color} - d_{shape} I^{shape}$, and derives a condition, $d_{shape} > -d_{color} / (1 - 2f)$, under which debiasing on one attribute harms the most disadvantaged group. The authors validate the phenomenon across four real-world domains (education, wages, scientific citations, and recidivism), finding that quotas often decrease representation for certain subgroups and can reduce overall ranking fairness. They advocate addressing the root causes of inequality rather than relying on purely numerical quota solutions, and they provide publicly available code to reproduce the results.
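
The mechanism can be illustrated with a small simulation. What follows is a minimal sketch, not the authors' published code. It assumes several things the summary does not state: true quality $q$ is standard normal, the two colors are equally likely, $I^{color} = 1$ for orange and $I^{shape} = 1$ for stars, and $f$ is the probability that a blue entity is a star (with $1 - f$ for orange entities). All function names are hypothetical.

```python
import random
from collections import Counter


def make_population(n, f, d_color, d_shape, seed=0):
    """Draw n entities as (color, shape, perceived_quality) tuples.

    Assumptions (not from the paper summary): q ~ N(0, 1), colors are
    equally likely, P(star | blue) = f and P(star | orange) = 1 - f.
    """
    rng = random.Random(seed)
    population = []
    for _ in range(n):
        color = rng.choice(["orange", "blue"])
        p_star = f if color == "blue" else 1.0 - f
        shape = "star" if rng.random() < p_star else "circle"
        q = rng.gauss(0.0, 1.0)
        # Perceived quality: q_hat = q - d_color * I_color - d_shape * I_shape
        q_hat = q - d_color * (color == "orange") - d_shape * (shape == "star")
        population.append((color, shape, q_hat))
    return population


def select_top(candidates, k):
    """Select the k candidates with the highest perceived quality."""
    return sorted(candidates, key=lambda c: c[2], reverse=True)[:k]


def select_with_color_quota(population, k):
    """Select k candidates so that colors are represented proportionally."""
    orange = [c for c in population if c[0] == "orange"]
    blue = [c for c in population if c[0] == "blue"]
    k_orange = round(k * len(orange) / len(population))
    return select_top(orange, k_orange) + select_top(blue, k - k_orange)


def subgroup_counts(selected):
    """Count selected entities per (color, shape) subgroup."""
    return Counter((color, shape) for color, shape, _ in selected)


if __name__ == "__main__":
    pop = make_population(n=20000, f=0.8, d_color=0.1, d_shape=1.0, seed=1)
    print("no quota:   ", subgroup_counts(select_top(pop, 2000)))
    print("color quota:", subgroup_counts(select_with_color_quota(pop, 2000)))
```

In this sketch, whenever orange entities are over-selected without a quota, the quota keeps only a top-ranked subset of the orange entities that were selected before, so the number of selected orange stars cannot increase even though being orange is a disadvantage by construction.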

Abstract

Many important decisions in societies such as school admissions, hiring, or elections are based on the selection of top-ranking individuals from a larger pool of candidates. This process is often subject to biases, which typically manifest as an under-representation of certain groups among the selected or accepted individuals. The most common approach to this issue is debiasing, for example via the introduction of quotas that ensure proportional representation of groups with respect to a certain, often binary attribute. Cases include quotas for women on corporate boards or ethnic quotas in elections. This, however, has the potential to induce changes in representation with respect to other attributes. For the case of two correlated binary attributes we show that quota-based debiasing based on a single attribute can worsen the representation of already underrepresented groups and decrease overall fairness of selection. We use several data sets from a broad range of domains from recidivism risk assessments to scientific citations to assess this effect in real-world settings. Our results demonstrate the importance of including all relevant attributes in debiasing procedures and that more efforts need to be put into eliminating the root causes of inequalities as purely numerical solutions such as quota-based debiasing might lead to unintended consequences.

Paper Structure

This paper contains 4 sections, 4 figures, 1 table.

Figures (4)

  • Figure 1: The Debiasing Paradox. If only one attribute (color) is considered then orange entities appear to have an advantage as their average perceived quality is higher (a). In fact, being orange is a disadvantage by construction (c). In this case, while debiasing on color seems to eliminate color bias (b), it, in fact, affects various subgroups differently (d). In particular, it worsens the representation of the already most disadvantaged group of orange stars. We call this effect the debiasing paradox.
  • Figure 2: Illustration of the Debiasing Paradox with real-world data along with model approximations. In all cases, debiasing decreases representation for some of the already underrepresented groups (panels e-h). In some cases debiasing decreases the representation of the most underrepresented group (panel e). Model approximation leads to qualitatively the same results for the changes in representations of groups (decreased or increased representation), implying relevance of our model for a wide range of real-world distributions.
  • Figure 3: The effects of debiasing on the quality of selected candidates for uncorrelated (a) and correlated (b) attributes. When shape and color are uncorrelated and there is no bias for shape (red solid line on panel a) then debiasing on color successfully maximizes the quality of selected candidates. If there exists bias for shape then the maximum value is not achieved but quality is still improved. However, if two attributes are correlated then in some cases overall quality decreases (shaded regions on panel b).
  • Figure S1: Changes in representation of the most underrepresented group. Green color corresponds to the increased representation and red to the decreased representation. The black line corresponds to the analytically derived curve $d_{shape} = -d_{color} / (1 - 2f)$.
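
The analytic curve in Figure S1 and the corresponding inequality $d_{shape} > -d_{color} / (1 - 2f)$ are simple to evaluate numerically. The helper below is a minimal sketch with hypothetical function names. It treats $f$ as a plain number, with the meaning defined in the paper's model, and rejects $f = 0.5$, where the expression is undefined.

```python
def shape_bias_threshold(d_color, f):
    """Return the boundary value d_shape = -d_color / (1 - 2f)."""
    if f == 0.5:
        raise ValueError("threshold is undefined for f = 0.5")
    return -d_color / (1.0 - 2.0 * f)


def paradox_condition_holds(d_color, d_shape, f):
    """Check the stated condition d_shape > -d_color / (1 - 2f)."""
    return d_shape > shape_bias_threshold(d_color, f)


if __name__ == "__main__":
    # Illustrative parameter values, chosen for this example only.
    print(shape_bias_threshold(d_color=0.1, f=0.75))
    print(paradox_condition_holds(d_color=0.1, d_shape=1.0, f=0.75))
```

Sweeping `d_color` and `d_shape` over a grid and coloring each point by the result of `paradox_condition_holds` would give the two regions separated by the black line in a plot like Figure S1.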