Hyper-parameter Tuning for Fair Classification without Sensitive Attribute Access

Akshaj Kumar Veldanda, Ivan Brugere, Sanghamitra Dutta, Alan Mishler, Siddharth Garg

TL;DR

Antigone addresses the challenge of achieving fair classification without access to sensitive attributes (SA) on training or validation data. It generates pseudo-sensitive attributes (PSA) from an ERM baseline and selects the best labeller by maximizing the Euclidean distance between the means of its correctly and incorrectly classified sets (EDM), a criterion underpinned by a mutually contaminated (MC) noise model. The PSA labels then guide hyperparameter tuning for popular fair-ML methods (e.g., JTT, AFR, GEORGE, ARL), improving worst-group accuracy and reducing fairness gaps across CelebA, Waterbirds, and UCI Adult, often approaching baselines tuned with ground-truth SA. The approach scales to large backbones (e.g., ViT) and offers a practical, unsupervised pathway to fair classification in privacy-conscious or SA-inaccessible settings.
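
The tuning step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that each candidate hyperparameter configuration has already produced validation predictions, that groups are defined by the pair (target label, PSA), and that the configuration with the highest worst-group accuracy is kept. The names `worst_group_accuracy`, `select_config`, `config_a` and `config_b` are hypothetical, and the average-accuracy constraint is omitted for brevity.

```python
from collections import defaultdict


def worst_group_accuracy(y_true, y_pred, psa):
    """Minimum accuracy over groups keyed by (target label, pseudo-sensitive attribute)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, p, a in zip(y_true, y_pred, psa):
        total[(y, a)] += 1
        correct[(y, a)] += int(y == p)
    return min(correct[g] / total[g] for g in total)


def select_config(candidates, y_true, psa):
    """Return the name of the candidate whose predictions maximize worst-group accuracy.

    `candidates` maps a (hypothetical) configuration name to its validation predictions.
    """
    return max(
        candidates,
        key=lambda name: worst_group_accuracy(y_true, candidates[name], psa),
    )


# Toy validation set: both candidates have the same average accuracy (6/8),
# but they distribute their errors differently across the PSA-defined groups.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
psa = [1, 1, 0, 0, 1, 1, 0, 0]
candidates = {
    "config_a": [1, 1, 0, 0, 0, 0, 0, 0],  # all errors fall on group (y=1, psa=0)
    "config_b": [1, 1, 1, 0, 0, 0, 0, 1],  # errors spread over two groups
}
best = select_config(candidates, y_true, psa)
```

Because the two candidates tie on average accuracy, only a group-aware metric separates them, which is why some form of group label is needed on the validation set.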

Abstract

Fair machine learning methods seek to train models that balance model performance across demographic subgroups defined over sensitive attributes like race and gender. Although sensitive attributes are typically assumed to be known during training, they may not be available in practice due to privacy and other logistical concerns. Recent work has sought to train fair models without sensitive attributes on training data. However, these methods need extensive hyper-parameter tuning to achieve good results, and hence assume that sensitive attributes are known on validation data, an assumption that might not be practical either. Here, we propose Antigone, a framework to train fair classifiers without access to sensitive attributes on either training or validation data. Instead, we generate pseudo-sensitive attributes on the validation data by training a biased classifier and using the classifier's incorrectly (correctly) labeled examples as proxies for minority (majority) groups. Since fairness metrics like demographic parity, equal opportunity and subgroup accuracy can be estimated to within a proportionality constant even with noisy sensitive attribute information, we show theoretically and empirically that these proxy labels can be used to maximize fairness under average accuracy constraints. Key to our results is a principled approach to select the hyper-parameters of the biased classifier in a completely unsupervised fashion (meaning without access to ground-truth sensitive attributes) that minimizes the gap between fairness estimated using noisy versus ground-truth sensitive labels.
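
The proxy-labelling idea in the abstract can be illustrated with a short sketch. It assumes binary targets and binary predictions, encodes correctly classified validation examples as PSA = 1 (majority proxy) and misclassified ones as PSA = 0 (minority proxy), and then measures a demographic parity gap against those proxy groups. The function names and toy data are hypothetical and are not taken from the paper.

```python
def pseudo_sensitive_attributes(y_true, y_biased_pred):
    """PSA = 1 where the biased classifier is correct (majority proxy), else 0."""
    return [int(y == p) for y, p in zip(y_true, y_biased_pred)]


def demographic_parity_gap(y_pred, groups):
    """Absolute difference in positive-prediction rate between group 1 and group 0."""
    rates = []
    for g in (0, 1):
        preds = [p for p, a in zip(y_pred, groups) if a == g]
        if not preds:
            raise ValueError(f"group {g} is empty")
        rates.append(sum(preds) / len(preds))
    return abs(rates[1] - rates[0])


# Toy validation data (hypothetical).
y_val = [1, 1, 1, 0, 0, 0]
biased_pred = [1, 1, 0, 0, 0, 1]  # predictions of the deliberately biased classifier
psa = pseudo_sensitive_attributes(y_val, biased_pred)

# Predictions of some other model whose fairness we want to estimate.
model_pred = [1, 0, 1, 0, 0, 1]
noisy_dp_gap = demographic_parity_gap(model_pred, psa)
```

The gap computed this way is an estimate against noisy group labels; the abstract's claim is that such estimates track the true gap up to a proportionality constant.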

Paper Structure

This paper contains 40 sections, 2 theorems, 10 equations, 6 figures, 14 tables.

Key Result

Proposition 2.1

Under the ideal MC noise model, DP and EO gaps measured on the noisy datasets are proportional to the true DP and EO gaps.
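
One plausible reading of this proportionality is written out below. It is a hedged reconstruction rather than the paper's exact statement: it assumes the proportionality constants are the $1-\alpha-\beta$ and $1-\alpha_{1}-\beta_{1}$ quantities named in the figure captions, and the $\Delta$ notation for the gaps is illustrative.

```latex
\Delta^{\mathrm{DP}}_{\text{noisy}} = (1-\alpha-\beta)\,\Delta^{\mathrm{DP}}_{\text{true}},
\qquad
\Delta^{\mathrm{EO}}_{\text{noisy}} = (1-\alpha_{1}-\beta_{1})\,\Delta^{\mathrm{EO}}_{\text{true}}
```

Under this reading, a labeller with a larger $1-\alpha-\beta$ yields noisy gaps closer to the true ones, and because the constant does not depend on the model being evaluated, ranking models by the noisy gap agrees with ranking them by the true gap whenever the constant is positive.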

Figures (6)

  • Figure 1: Antigone on CelebA dataset with hair color as target label and gender as (unknown) sensitive attribute. Blond men are discriminated against. Correspondingly, the mean image of the Blond class incorrect set (row 4) has more male features than that of its correct set (row 1), reflecting this bias. Similarly, a bias against non-blond women is also reflected. PSA $= 0$ corresponds to disadvantaged groups, and PSA $= 1$ corresponds to advantaged groups.
  • Figure 2: Euclidean Distance between Means (EDM) and noise parameters ($\alpha_{1}, \beta_{1}$ and $1-\alpha_{1}-\beta_{1}$) for the positive target class of Waterbirds dataset. Blue dot indicates the model picked by Antigone, while black star indicates the model that maximizes $1-\alpha_{1}-\beta_{1}$.
  • Figure 3: (a) The CelebA and Waterbirds datasets, along with the fraction of examples in each sub-group of their respective training sets. (b) Euclidean Distance between Means (EDM) and noise parameters ($\alpha_{1}, \beta_{1}$ and $1-\alpha_{1}-\beta_{1}$) for the positive target class of the CelebA dataset. The noise parameters are unknown in practice. Blue dot indicates the model that we pick to generate pseudo-sensitive attributes, while black star indicates the model that maximizes $1-\alpha_{1}-\beta_{1}$.
  • Figure 4: Strong positive correlation between EDM and $1-\alpha-\beta$, with Pearson correlation coefficients of 0.90 and 0.95 for the (a) waterbirds and (b) landbirds classes, respectively, on the Waterbirds dataset.
  • Figure 5: Performance of Antigone+GEORGE and GEORGE in terms of target label accuracy and WGA across multiple trials for both $k=5$ and $k=2$ on the CelebA dataset.
  • ...and 1 more figure
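
The EDM criterion referenced in the captions above can be sketched in a few lines. This is a minimal illustration under stated assumptions: EDM is computed per target class as the Euclidean distance between the mean feature vector of that class's correctly classified examples and that of its misclassified examples, features are plain lists of floats, and the candidate with the largest EDM is selected. The helper names and the toy candidates `labeller_a` and `labeller_b` are hypothetical.

```python
import math


def mean_vector(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def edm(features, y_true, y_pred, target_class):
    """Euclidean distance between the means of the correct and incorrect sets of one class."""
    correct, incorrect = [], []
    for x, y, p in zip(features, y_true, y_pred):
        if y != target_class:
            continue
        (correct if p == y else incorrect).append(x)
    if not correct or not incorrect:
        return 0.0  # assumption: a degenerate split carries no separation signal
    return math.dist(mean_vector(correct), mean_vector(incorrect))


def pick_labeller(features, y_true, candidate_preds, target_class):
    """Return the name of the candidate whose correct/incorrect split has the largest EDM."""
    return max(
        candidate_preds,
        key=lambda name: edm(features, y_true, candidate_preds[name], target_class),
    )


# Toy data: the first feature separates two hidden sub-groups of class 1.
features = [[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0]]
y_true = [1, 1, 1, 1]
candidate_preds = {
    "labeller_a": [1, 1, 0, 0],  # errors align with the hidden sub-group
    "labeller_b": [1, 0, 1, 0],  # errors cut across the hidden sub-group
}
chosen = pick_labeller(features, y_true, candidate_preds, target_class=1)
```

In the toy data, `labeller_a` makes its mistakes on exactly one hidden sub-group, so its correct and incorrect sets are far apart in feature space and it is the one selected. No sensitive attribute is consulted at any point, which is what makes the criterion usable in the unsupervised setting.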

Theorems & Definitions (4)

  • Proposition 2.1
  • Lemma 2.2
  • proof
  • Remark 2.3