
Bias In, Bias Out? Finding Unbiased Subnetworks in Vanilla Models

Ivan Luiz De Moura Matos, Abdel Djalil Sad Saoud, Ekaterina Iakovleva, Vito Paolo Pastore, Enzo Tartaglione

TL;DR

This work introduces Bias-Invariant Subnetwork Extraction (BISE), a learning strategy that identifies and isolates "bias-free" subnetworks that already exist within conventionally trained models, without retraining or finetuning the original parameters.

Abstract

The issue of algorithmic biases in deep learning has led to the development of various debiasing techniques, many of which perform complex training procedures or dataset manipulation. However, an intriguing question arises: is it possible to extract fair and bias-agnostic subnetworks from standard vanilla-trained models without relying on additional data, such as an unbiased training set? In this work, we introduce Bias-Invariant Subnetwork Extraction (BISE), a learning strategy that identifies and isolates "bias-free" subnetworks that already exist within conventionally trained models, without retraining or finetuning the original parameters. Our approach demonstrates that such subnetworks can be extracted via pruning and can operate without modification, effectively relying less on biased features and maintaining robust performance. Our findings contribute towards efficient bias mitigation through structural adaptation of pre-trained neural networks via parameter removal, as opposed to costly strategies that are either data-centric or involve (re)training all model parameters. Extensive experiments on common benchmarks show the advantages of our approach in terms of the performance and computational efficiency of the resulting debiased model.
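
To make the idea of "extracting a subnetwork via pruning without touching the original parameters" concrete, the following is a minimal sketch of the general mechanism only: the pretrained weights stay frozen, a per-weight gate is trained, and the gates are then thresholded into a binary mask. It is not the BISE objective. The function names `learn_mask` and `extract_subnetwork`, the toy linear classifier, and the loss (binary cross-entropy plus a sparsity penalty on the gates) are all assumptions made for illustration; the bias-invariance term and the auxiliary classifier described in the paper are not reproduced here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def learn_mask(weights, data, steps=200, lr=0.5, sparsity=0.05):
    """Train gate logits for a frozen linear classifier (illustrative only).

    `weights` is read but never modified; only `scores` is updated.
    Assumed loss: mean binary cross-entropy + sparsity * sum(gates).
    """
    scores = [2.0] * len(weights)  # gates start close to 1 (keep everything)
    for _ in range(steps):
        gates = [sigmoid(s) for s in scores]
        grads = [0.0] * len(weights)
        for x, y in data:
            logit = sum(g * w * xi for g, w, xi in zip(gates, weights, x))
            err = sigmoid(logit) - y  # derivative of BCE w.r.t. the logit
            for i in range(len(weights)):
                # chain rule through gate_i = sigmoid(score_i)
                grads[i] += err * weights[i] * x[i] * gates[i] * (1.0 - gates[i]) / len(data)
        for i in range(len(weights)):
            # gradient of the sparsity penalty on gate_i
            grads[i] += sparsity * gates[i] * (1.0 - gates[i])
            scores[i] -= lr * grads[i]
    return scores


def extract_subnetwork(weights, scores, threshold=0.5):
    """Threshold the gates into a binary mask and apply it to the frozen weights."""
    mask = [1 if sigmoid(s) >= threshold else 0 for s in scores]
    pruned = [w * m for w, m in zip(weights, mask)]
    return mask, pruned


# Toy usage: the second input feature is always zero, so its weight is unused.
frozen = [3.0, 5.0]
toy_data = [([1.0, 0.0], 1), ([-1.0, 0.0], 0)]
scores = learn_mask(frozen, toy_data, steps=2000)
mask, pruned = extract_subnetwork(frozen, scores)
```

The design point this sketch shares with the approach described above is that the original weights are only ever multiplied by a mask, never updated, so the extracted subnetwork reuses the vanilla model's parameters as they are.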

Paper Structure

This paper contains 30 sections, 18 equations, 10 figures, 21 tables, 1 algorithm.

Figures (10)

  • Figure 1: Overview of the proposed method. BISE aims to extract an unbiased subnetwork from the biased vanilla-trained network.
  • Figure 2: Illustration of BISE. Solid black arrows indicate forward propagation; dashed blue arrows indicate backward gradient propagation. During the training of the mask $\mathcal{M}$ and of $\mathcal{C}_\text{aux}$, the original model, $f = \mathcal{C}\circ\mathcal{E}$, is kept frozen.
  • Figure 3: Analysis of (a) coefficient $\gamma$ and (b) pruning strategies. Shaded areas indicate the interval of one standard deviation.
  • Figure C.1: Samples from the BiasedMNIST dataset (Bahng et al., 2020). The first row shows bias-aligned images; the second row shows bias-conflicting images.
  • Figure C.2: Samples from the CelebA dataset (Liu et al., 2015).
  • ...and 5 more figures