Benchmarking Stochastic Approximation Algorithms for Fairness-Constrained Training of Deep Neural Networks

Andrii Kliachkin, Jana Lepšová, Gilles Bareilles, Jakub Mareček

TL;DR

This work tackles the problem of training deep neural networks under fairness constraints at scale by introducing a real-world benchmark based on Folktables (US Census) and comparing three practical stochastic constrained-ERM algorithms. It frames fairness through independence, separation, and sufficiency constraints and surveys related in-processing approaches, highlighting the lack of standardized tools for fairness-constrained training. The authors implement and benchmark Stochastic Ghost, SSL-ALM, and Stochastic Switching Subgradient alongside baselines, and report that augmented-Lagrangian–based methods offer favorable trade-offs between objective minimization and constraint satisfaction, while the stability of the stochastic methods varies. The study provides a publicly available Python toolbox that enables reproducible evaluation of new methods on large-scale fairness problems, and it offers insights into the practical challenges and hyperparameter sensitivities of fairness-constrained learning on real data.

Abstract

The ability to train Deep Neural Networks (DNNs) with constraints is instrumental in improving the fairness of modern machine-learning models. Many algorithms have been analysed in recent years, and yet there is no standard, widely accepted method for the constrained training of DNNs. In this paper, we provide a challenging benchmark of real-world large-scale fairness-constrained learning tasks, built on top of the US Census (Folktables). We point out the theoretical challenges of such tasks and review the main approaches in stochastic approximation algorithms. Finally, we demonstrate the use of the benchmark by implementing and comparing three recently proposed, but as-of-yet unimplemented, algorithms in terms of both optimization performance and fairness improvement. We release the code of the benchmark as a Python package at https://github.com/humancompatible/train.

Paper Structure

This paper contains 43 sections, 19 equations, 5 figures, 12 tables, 3 algorithms.

Figures (5)

  • Figure 1: Train (blue) and test (orange) statistics over time (s) on the ACS Income dataset for each algorithm: SGD (column 1), fairret-regularized SGD (column 2), SSL-ALM (column 3), ALM (column 4), Switching Subgradient (column 5), and Stochastic Ghost (column 6). The plots depict the mean values of the loss (first row) and the constraint at each timestamp, rounded to the nearest 0.5 seconds, over 10 runs. The shaded area depicts the region between the first and third quartiles.
  • Figure 2: Train (blue) and test (orange) statistics over time (s) on the ACS Income dataset for each algorithm: SGD (column 1), regularized SGD (column 2), SSL-ALM (column 3), Switching Subgradient (column 4), and Stochastic Ghost (column 5). The plots depict the mean values for loss (first row) and constraints (second to last row) at each timestamp, rounded to the nearest 0.5 seconds, over 10 runs. The shaded area depicts the region between the first and third quartiles.
  • Figure 3: Distribution of predictions for each algorithm. Left to right: SGD, SGD-Fairret, SSL-ALM, ALM, Switching Subgradient (SSw), Stochastic Ghost (StGh). Blue and red denote the "white" and "non-white" groups, respectively.
  • Figure 4: Average value of the three fairness metrics (independence (Ind), separation (Sp), and sufficiency (Sf)), along with mean inaccuracy (Ina), and difference in accuracy between the two groups (DifAcc). For all metrics, smaller values are better.
  • Figure 5: Loss (top row) and constraint (bottom row) evolution of Stochastic Ghost on the validation dataset with different hyperparameter choices. The line corresponds to the mean value over 5 runs; the shaded region corresponds to the area between the first and third quartiles over 5 runs.
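
The captions above describe curves built by rounding timestamps to the nearest 0.5 seconds, averaging over runs, and shading the band between the first and third quartiles. A minimal sketch of one way to compute such curves is given below. It is not the authors' plotting code: the function name `aggregate_runs`, the input layout (one `(times, values)` pair per run), and the choice to average repeated samples within a run before aggregating across runs are all assumptions made for illustration.

```python
import numpy as np


def aggregate_runs(runs, step=0.5):
    """Aggregate per-run time series onto a shared time grid.

    runs: list of (times, values) pairs, one pair per run (hypothetical layout).
    Returns (grid, mean, q1, q3) as NumPy arrays.
    """
    buckets = {}  # grid index -> one value per run that reached this bucket
    for times, values in runs:
        # Round each timestamp to the nearest multiple of `step`.
        keys = np.round(np.asarray(times, dtype=float) / step).astype(int)
        per_run = {}
        for key, value in zip(keys.tolist(), values):
            per_run.setdefault(key, []).append(value)
        # Assumption: several samples of one run in a bucket are averaged first.
        for key, vals in per_run.items():
            buckets.setdefault(key, []).append(float(np.mean(vals)))

    keys_sorted = sorted(buckets)
    grid = np.array(keys_sorted, dtype=float) * step
    mean = np.array([np.mean(buckets[k]) for k in keys_sorted])
    q1 = np.array([np.percentile(buckets[k], 25) for k in keys_sorted])
    q3 = np.array([np.percentile(buckets[k], 75) for k in keys_sorted])
    return grid, mean, q1, q3


# Two toy runs with slightly different timestamps.
example_runs = [
    ([0.1, 0.6], [1.0, 2.0]),
    ([0.2, 0.4], [3.0, 4.0]),
]
grid, mean, q1, q3 = aggregate_runs(example_runs)
```

With the arrays in hand, the mean can be drawn as a line and the band between `q1` and `q3` as a shaded region, for example with matplotlib's `fill_between`.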