DeNetDM: Debiasing by Network Depth Modulation

Silpa Vadakkeeveetil Sreelatha, Adarsh Kappiyath, Abhra Chaudhuri, Anjan Dutta

TL;DR

DeNetDM is a novel debiasing method that uses network depth modulation to build robustness to spurious correlations. It outperforms existing debiasing techniques on both synthetic and real-world datasets by 5%.

Abstract

Neural networks trained on biased datasets tend to inadvertently learn spurious correlations, hindering generalization. We formally prove that (1) samples that exhibit spurious correlations lie on a lower-rank manifold relative to the ones that do not; and (2) the depth of a network acts as an implicit regularizer on the rank of the attribute subspace that is encoded in its representations. Leveraging these insights, we present DeNetDM, a novel debiasing method that uses network depth modulation as a way of developing robustness to spurious correlations. Using a training paradigm derived from Product of Experts, we create both biased and debiased branches with deep and shallow architectures and then distill knowledge to produce the target debiased model. Our method requires no bias annotations or explicit data augmentation while performing on par with approaches that require either or both. We demonstrate that DeNetDM outperforms existing debiasing techniques on both synthetic and real-world datasets by 5%. The project page is available at https://vssilpa.github.io/denetdm/.

Paper Structure (34 sections, 5 theorems, 38 equations, 5 figures, 12 tables, 1 algorithm)

Key Result

Theorem 1

When the partitioning $X = X_a \cup X_c$ is stable with respect to $C$, the rank of the bias-aligned partition is upper-bounded by the rank of the bias-conflicting partition, i.e., $\operatorname{rank}(X_a) \leq \operatorname{rank}(X_c)$.

Figures (5)

  • Figure 1: Illustration of the DeNetDM framework: In Stage 1, an ensemble of shallow and deep branches produces outputs linearly combined and trained as a product of experts. The cross-entropy loss with depth modulation aids in separating biases and identifying target attributes. In Stage 2, we further introduce a target branch with the desired architecture, which also requires debiasing. This phase exclusively focuses on refining the target branch's feature extractor ($\phi_{t}$) and classifier head ($f_{t}$) while leveraging knowledge from the initial stages.
  • Figure 2: Exploring the effect of depth modulation: (a) illustrates how the linear decodability of features decreases as neural network depth increases, while (b) dives into the training dynamics of MLPs with varying depths under ERM.
  • Figure 3: Early training dynamics of DeNetDM.
  • Figure 4: Early training dynamics of DeNetDM on C-CIFAR10 dataset.
  • Figure 5: Samples from training data of CMNIST, Corrupted-CIFAR10 and Biased FFHQ.
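
The Figure 1 caption describes Stage 1 as linearly combining the outputs of the deep and shallow branches and training them as a product of experts. The sketch below illustrates the general identity behind that description, not the authors' implementation: it assumes the linear combination is a plain sum of per-branch logits, and the logit values, function names, and three-class setup are all hypothetical. Under that assumption, a softmax over the summed logits equals the renormalized product of the two branches' individual softmax distributions, so a single cross-entropy loss on the summed logits trains both experts jointly.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def poe_from_logits(logits_a, logits_b):
    """Product of experts computed by summing logits, then applying softmax."""
    return softmax([a + b for a, b in zip(logits_a, logits_b)])


def poe_from_probs(logits_a, logits_b):
    """Product of experts computed by multiplying and renormalizing softmaxes."""
    pa, pb = softmax(logits_a), softmax(logits_b)
    prod = [x * y for x, y in zip(pa, pb)]
    total = sum(prod)
    return [p / total for p in prod]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label under the joint distribution."""
    return -math.log(probs[label])


# Hypothetical logits for one sample in a 3-class problem (illustrative values).
deep_logits = [2.0, 0.5, -1.0]
shallow_logits = [0.1, 1.2, 0.3]

joint = poe_from_logits(deep_logits, shallow_logits)
check = poe_from_probs(deep_logits, shallow_logits)
loss = cross_entropy(joint, label=1)

# Both routes give the same joint distribution up to floating-point error.
assert all(abs(a - b) < 1e-12 for a, b in zip(joint, check))
```

Because the loss is taken on the joint distribution, gradients flow to both branches at once. How the depth difference then steers which branch captures the bias is the subject of the paper's analysis and is not modeled in this sketch.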

Theorems & Definitions (10)

  • Definition 1: Stability
  • Theorem 1: Partition Rank
  • Theorem 2: Depth-Rank Duality
  • proof
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • proof
  • Corollary 2.1