(Unfair) Norms in Fairness Research: A Meta-Analysis

Jennifer Chien, A. Stevie Bergman, Kevin R. McKee, Nenad Tomasev, Vinodkumar Prabhakaran, Rida Qadri, Nahema Marchal, William Isaac

TL;DR

This paper conducts a reflexive meta-analysis of algorithmic fairness research from AIES and FAccT (2018–2022) to reveal embedded norms shaping fairness work. It finds a pronounced US-centric bias in author affiliations and data provenance, coupled with widespread use of binary formulations for sensitive attributes, which together risk narrowing the field’s relevance to global contexts. The authors argue for a paradigm shift toward more inclusive, context-specific, and intersectional approaches, incorporating transparency, broader representation, participatory methods, and ethnographic work. The study highlights the practical impact of normative choices on the design and evaluation of fair AI systems and calls for expanding the disciplinary and geographic scope of fairness research to better serve diverse populations.

Abstract

Algorithmic fairness has emerged as a critical concern in artificial intelligence (AI) research. However, the development of fair AI systems is not an objective process. Fairness is an inherently subjective concept, shaped by the values, experiences, and identities of those involved in research and development. To better understand the norms and values embedded in current fairness research, we conduct a meta-analysis of algorithmic fairness papers from two leading conferences on AI fairness and ethics, AIES and FAccT, covering a final sample of 139 papers over the period from 2018 to 2022. Our investigation reveals two concerning trends: first, a US-centric perspective dominates throughout fairness research; and second, fairness studies exhibit a widespread reliance on binary codifications of human identity (e.g., "Black/White", "male/female"). These findings highlight how current research often overlooks the complexities of identity and lived experiences, ultimately failing to represent diverse global contexts when defining algorithmic bias and fairness. We discuss the limitations of these research design choices and offer recommendations for fostering more inclusive and representative approaches to fairness in AI systems, urging a paradigm shift that embraces nuanced, global understandings of human identity and values.

Paper Structure (23 sections, 10 figures, 3 tables)

This paper contains 23 sections, 10 figures, 3 tables.

Figures (10)

  • Figure 1: Conference, Study Type, and Dataset Topic Domains Over Time. (a) The number of papers published by each conference trends upward over time. (b) Across all years, a majority of papers are retrospective (conducting empirical analyses on pre-existing datasets). (c) Papers examine an increasing variety of topic domains over time. Overall, finance is the most common dataset domain, followed by criminal justice.
  • Figure 2: Distribution of Author Country-Affiliation and Dataset Country-Affiliation. These cartograms represent country sizes proportional to the count of (a) author affiliations and (b) datasets attributed to each country. The US emerges as the most highly represented country for both authorship origin and data provenance. Graphics created with gastner2018fast.
  • Figure 3: Sensitive Attributes Studied Over Time. Count reflects the number of times papers in our sample analyzed each sensitive attribute category in a given year.
  • Figure 4: Distribution of Label Formulations for Sensitive Attributes Across Fairness Studies. This alluvial diagram illustrates the range of formulations that fairness studies use to label sensitive attributes. Each colored band represents, on the left, one of the three most frequently studied sensitive attributes in our sample, and on the right, the number of studies utilizing a particular formulation. The grey bars in the middle indicate the number of unique categories employed within each formulation. Studies predominantly formulate gender as a "female/male" binary. Across race, the top two formulations are "Black/White" and "African-American/non-African-American". Finally, age shows some increased diversity in the number of categories considered, though papers still exhibit a strong tendency towards binary formulations (e.g., 0-24 and 25+ or 0-64 and 65+).
  • Figure 5: Binary Formulations for Sensitive Attributes. This alluvial diagram highlights the binary label formulations used for the three most studied sensitive attributes (gender, race, and age) within fairness studies. Each colored band represents, on the left, one of the three most frequently studied sensitive attributes in our sample, and on the right, the number of studies utilizing a particular formulation. The grey bars in the middle indicate the number of unique categories employed within each formulation.
  • ...and 5 more figures