Fairness Without Labels: Pseudo-Balancing for Bias Mitigation in Face Gender Classification

Haohua Dong; Ana Manzano Rodríguez; Camille Guinaudeau; Shin'ichi Satoh

TL;DR

This paper tackles bias in face gender classification that arises from unbalanced training data, in a setting where no demographic attribute labels are available. It introduces pseudo-balancing, a lightweight strategy that enforces demographic parity during pseudo-label selection in semi-supervised learning, using unlabeled data from a race-balanced source such as FairFace. Across two scenarios, the method improves overall accuracy and substantially reduces gender disparities on the All-Age-Faces benchmark, notably narrowing the gap in the East Asian subgroup, without requiring explicit demographic annotations. The work demonstrates the practicality of leveraging balanced unlabeled data to debias computer vision models, and it outlines limitations under severe data skew along with directions for future enhancements and broader application.

Abstract

Face gender classification models often reflect and amplify demographic biases present in their training data, leading to uneven performance across gender and racial subgroups. We introduce pseudo-balancing, a simple and effective strategy for mitigating such biases in semi-supervised learning. Our method enforces demographic balance during pseudo-label selection, using only unlabeled images from a race-balanced dataset without requiring access to ground-truth annotations. We evaluate pseudo-balancing under two conditions: (1) fine-tuning a biased gender classifier using unlabeled images from the FairFace dataset, and (2) stress-testing the method with intentionally imbalanced training data to simulate controlled bias scenarios. In both cases, models are evaluated on the All-Age-Faces (AAF) benchmark, which contains a predominantly East Asian population. Our results show that pseudo-balancing consistently improves fairness while preserving or enhancing accuracy. The method achieves 79.81% overall accuracy, a 6.53% improvement over the baseline, and reduces the gender accuracy gap by 44.17%. In the East Asian subgroup, where baseline disparities exceeded 49%, the gap is narrowed to just 5.01%. These findings suggest that even in the absence of label supervision, access to a demographically balanced or moderately skewed unlabeled dataset can serve as a powerful resource for debiasing existing computer vision models.
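
The abstract does not spell out the exact selection rule, so the following is only a minimal sketch of one plausible reading of balanced pseudo-label selection. It assumes that balance is enforced over the predicted gender classes: confident predictions on unlabeled images are grouped by predicted class, and the same number of samples is kept from each group. The function name `pseudo_balanced_select`, the `threshold` parameter, and its default of 0.9 are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def pseudo_balanced_select(
    probs: Sequence[Sequence[float]],
    threshold: float = 0.9,  # hypothetical confidence cutoff, not from the paper
) -> List[Tuple[int, int]]:
    """Return (sample_index, pseudo_label) pairs with equal counts per class.

    probs[i][c] is the classifier's predicted probability of class c for
    unlabeled sample i. Assumption: balance is enforced over predicted classes.
    """
    if not probs:
        return []
    num_classes = len(probs[0])

    # Group confident predictions by their predicted class.
    buckets: List[List[Tuple[float, int]]] = [[] for _ in range(num_classes)]
    for idx, p in enumerate(probs):
        label = max(range(num_classes), key=lambda c: p[c])
        if p[label] >= threshold:
            buckets[label].append((p[label], idx))

    # The smallest group sets the per-class quota, so no class dominates.
    quota = min(len(bucket) for bucket in buckets)

    selected: List[Tuple[int, int]] = []
    for label, bucket in enumerate(buckets):
        # Keep the most confident samples first within each class.
        bucket.sort(key=lambda item: item[0], reverse=True)
        selected.extend((idx, label) for _, idx in bucket[:quota])
    return selected


# Toy example: three confident class-0 predictions, one confident class-1.
toy_probs = [
    [0.97, 0.03],
    [0.95, 0.05],
    [0.92, 0.08],
    [0.10, 0.90],
    [0.60, 0.40],  # below the threshold, never selected
]
print(pseudo_balanced_select(toy_probs))
```

In this toy run only one class-1 prediction clears the threshold, so only the single most confident class-0 sample is kept alongside it. A plain confidence threshold would instead have passed three class-0 samples and one class-1 sample, carrying the classifier's own skew into the next round of training.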
