Fairness and Bias Mitigation in Computer Vision: A Survey

Sepehr Dehdashtian, Ruozhen He, Yi Li, Guha Balakrishnan, Nuno Vasconcelos, Vicente Ordonez, Vishnu Naresh Boddeti

TL;DR

This survey addresses fairness and bias in computer vision by linking CV-specific biases to broader fair ML concepts and providing a taxonomy of origins, definitions, discovery methods, and mitigation strategies. It reviews how biases arise from data, models, and task contexts, and it catalogs mitigation approaches, including fair representation learning, counterfactual data rebalancing, and score calibration, with emphasis on both classification and multimodal tasks. The paper also surveys datasets and resources used to quantify and reduce bias, highlighting task- and attribute-specific diversity and the current limitations of pretrained models, including foundation and generative models. By synthesizing trends, trade-offs, and benchmark practices, the work offers a reference point for researchers and practitioners designing fairer CV systems and identifies open challenges in aligning fairness with performance in diverse, real-world settings.

Abstract

Computer vision systems have witnessed rapid progress over the past two decades due to multiple advances in the field. As these systems are increasingly deployed in high-stakes real-world applications, there is a dire need to ensure that they do not propagate or amplify discriminatory tendencies in historical or human-curated data, or inadvertently learn biases from spurious correlations. This paper presents a comprehensive survey on fairness that summarizes and sheds light on ongoing trends and successes in the context of computer vision. The topics we discuss include: 1) the origin and technical definitions of fairness drawn from the wider fair machine learning literature and adjacent disciplines; 2) work that sought to discover and analyze biases in computer vision systems; 3) a summary of methods proposed to mitigate bias in computer vision systems in recent years; 4) a comprehensive summary of resources and datasets produced by researchers to measure, analyze, and mitigate bias and enhance fairness; and 5) a discussion of the field's success, continuing trends in the context of multimodal foundation and generative models, and gaps that still need to be addressed. The presented characterization should help researchers understand the importance of identifying and mitigating bias in computer vision, understand the state of the field, and identify potential directions for future research.

Paper Structure (23 sections, 5 equations, 4 figures, 3 tables)

Figures (4)

  • Figure 1: Examples of bias and unfairness in discriminative and generative computer vision systems. Left: Bias in discriminative modeling, shown through face recognition [buolamwini2018gender] and situation recognition [zhao2017men, yatskar2017commonly] examples. Right: Stereotypical bias in generative modeling, with examples from three cultures and three professions [bianchi2023easily].
  • Figure 2: Dependence graphs [dehdashtian2024fairerclip] illustrating how biases arise from (a) inherent relations and (b) spurious correlations.
  • Figure 3: Fair Representation Learning. An encoder $f$ maps images to a representation $Z$. A target branch maximizes the statistical dependence between $Z$ and the target label $Y$, while a fairness branch minimizes the statistical dependence between $Z$ and the protected attribute $S$. Methods in this class differ in the choice of loss functions $L_Y$ and $L_S$, the models for $f_Y$ and $f_S$, and the learning procedure (iterative vs. closed-form, local vs. global optima).
  • Figure 4: The utility-fairness trade-offs [dehdashtian2024utilityfairness]. (Left) Models can be evaluated by their utility (e.g., accuracy, MSE loss, F1 score) w.r.t. a target label $Y$ and their unfairness w.r.t. a sensitive attribute $S$. [dehdashtian2024utilityfairness] introduce two trade-offs: the Data Space Trade-Off (DST) and the Label Space Trade-Off (LST). (Right) [dehdashtian2024utilityfairness] empirically estimate DST and LST on CelebA and evaluate the utility (high cheekbones) and fairness (gender and age) of over 100 zero-shot and 900 supervised image models.
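
The two-branch objective described in the Figure 3 caption can be illustrated with a small NumPy sketch. It is not a method from the survey: it assumes a linear encoder, uses the squared cross-covariance as a simple linear proxy for statistical dependence, and runs on synthetic features rather than images. The names `cross_cov`, `fair_linear_encoder`, `linear_dependence`, and the weight `lam` are hypothetical and chosen only for this example. Under these assumptions the objective has a closed-form solution (a top eigenvector), which corresponds to the "closed-form, global optima" end of the spectrum the caption mentions.

```python
import numpy as np


def cross_cov(a, b):
    """Empirical cross-covariance between samples a (n, p) and b (n, q)."""
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    return a.T @ b / len(a)


def linear_dependence(z, t):
    """Squared Frobenius norm of the cross-covariance (a linear dependence proxy)."""
    return float(np.sum(cross_cov(z, t) ** 2))


def fair_linear_encoder(x, y, s, lam=1.0, dim=1):
    """Columns of W maximize dep(XW, Y) - lam * dep(XW, S) subject to W^T W = I.

    For the proxy above, the objective equals tr(W^T M W) with
    M = Cxy Cxy^T - lam * Cxs Cxs^T, so the optimum is given by the
    eigenvectors of M with the largest eigenvalues.
    """
    cxy = cross_cov(x, y)
    cxs = cross_cov(x, s)
    m = cxy @ cxy.T - lam * (cxs @ cxs.T)
    _, eigvecs = np.linalg.eigh(m)  # eigenvalues are returned in ascending order
    return eigvecs[:, -dim:]


# Synthetic data (an assumption for illustration): Y is the target label,
# S is the protected attribute, and the third feature mixes both.
rng = np.random.default_rng(0)
n = 2000
Y = rng.integers(0, 2, size=(n, 1)).astype(float)
S = rng.integers(0, 2, size=(n, 1)).astype(float)
X = np.hstack([Y, S, Y + S]) + 0.1 * rng.normal(size=(n, 3))

W_plain = fair_linear_encoder(X, Y, S, lam=0.0)   # target branch only
W_fair = fair_linear_encoder(X, Y, S, lam=10.0)   # target + fairness branch

dep_s_plain = linear_dependence(X @ W_plain, S)
dep_s_fair = linear_dependence(X @ W_fair, S)
dep_y_fair = linear_dependence(X @ W_fair, Y)

print(f"dep(Z, S) without fairness term: {dep_s_plain:.4f}")
print(f"dep(Z, S) with fairness term:    {dep_s_fair:.4f}")
print(f"dep(Z, Y) with fairness term:    {dep_y_fair:.4f}")
```

Raising `lam` pushes the encoder away from directions that covary with $S$ while retaining those that covary with $Y$. A nonlinear encoder or a nonlinear dependence measure would generally require iterative optimization instead of an eigendecomposition.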