An Effective Theory of Bias Amplification

Arjun Subramonian, Samuel J. Bell, Levent Sagun, Elvis Dohmatob

TL;DR

The paper addresses bias amplification in machine learning by building a unified theory for ridge regression with and without random projections, modeling data as a two-group Gaussian mixture. It uses operator-valued free probability theory to obtain deterministic equivalents for the groupwise test-risk disparities (EDD, ODD) and the amplification metric ADD across parameterization regimes, data covariance structures, and group sizes. The authors derive a bias–variance decomposition $R_s(\widehat{f}) \simeq B_s(\widehat{f}) + V_s(\widehat{f})$, with $B_s$ and $V_s$ expressed via fixed-point scalars that capture inter-group covariance interactions, and they validate the theory with experiments on synthetic and semi-synthetic data, including isotropic-covariance settings and Colored MNIST. The work identifies phase transitions and regimes where regularization or early stopping can mitigate bias, and it offers insights for evaluating and mitigating unfairness in ML, such as how overparameterization and feature composition affect minority-group performance.

Abstract

Machine learning models can capture and amplify biases present in data, leading to disparate test performance across social groups. To better understand, evaluate, and mitigate these biases, a deeper theoretical understanding of how model design choices and data distribution properties contribute to bias is needed. In this work, we contribute a precise analytical theory in the context of ridge regression, both with and without random projections, where the former models feedforward neural networks in a simplified regime. Our theory offers a unified and rigorous explanation of machine learning bias, providing insights into phenomena such as bias amplification and minority-group bias in various feature and parameter regimes. For example, we observe that there may be an optimal regularization penalty or training time to avoid bias amplification, and there can be differences in test error between groups that are not alleviated with increased parameterization. Importantly, our theoretical predictions align with empirical observations reported in the literature on machine learning bias. We extensively empirically validate our theory on synthetic and semi-synthetic datasets.
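The setting described in the abstract can be probed numerically. The following is a minimal sketch, not the authors' code. It assumes a two-group linear model with isotropic covariances ($\Sigma_1 = 2 I_d$, $\Sigma_2 = I_d$), ground-truth weights drawn as $w_1^* \sim N(0, \Theta/d)$ and $w_2^* = w_1^* + \delta$ with $\delta \sim N(0, \Delta/d)$, and the definitions $ODD = |R_1(\widehat{f}) - R_2(\widehat{f})|$ for a single model trained on both groups, $EDD = |R_1(\widehat{f}_1) - R_2(\widehat{f}_2)|$ for per-group models, and $ADD = ODD / EDD$. These definitions, the excess-risk form of $R_s$, the ridge normalization, and all function names are assumptions made for illustration.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge solution of (1/n)||Xw - y||^2 + lam * ||w||^2."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)


def excess_risk(w_hat, w_star, cov_scale):
    """Excess test risk (w_hat - w*)^T Sigma (w_hat - w*) for Sigma = cov_scale * I."""
    diff = w_hat - w_star
    return cov_scale * float(diff @ diff)


def simulate(n=2000, d=200, p1=0.5, lam=1e-3, seed=0):
    """Return empirical ODD, EDD, and ADD for one draw of the assumed setup."""
    rng = np.random.default_rng(seed)
    cov_scale = (2.0, 1.0)   # Sigma_1 = 2 I, Sigma_2 = I (assumed)
    noise_var = (1.0, 1.0)   # sigma_1^2 = sigma_2^2 = 1 (assumed)

    # Ground-truth weights: Theta = 2 I for group 1, Delta = I for the shift.
    w1 = rng.normal(size=d) * np.sqrt(2.0 / d)
    w2 = w1 + rng.normal(size=d) * np.sqrt(1.0 / d)
    w_star = (w1, w2)

    n1 = int(p1 * n)
    sizes = (n1, n - n1)
    Xs, ys = [], []
    for s in range(2):
        X = rng.normal(size=(sizes[s], d)) * np.sqrt(cov_scale[s])
        y = X @ w_star[s] + rng.normal(size=sizes[s]) * np.sqrt(noise_var[s])
        Xs.append(X)
        ys.append(y)

    # One model trained on both groups (used for ODD).
    w_joint = ridge_fit(np.vstack(Xs), np.concatenate(ys), lam)
    # One model per group (used for EDD).
    w_sep = [ridge_fit(Xs[s], ys[s], lam) for s in range(2)]

    odd = abs(excess_risk(w_joint, w1, cov_scale[0])
              - excess_risk(w_joint, w2, cov_scale[1]))
    edd = abs(excess_risk(w_sep[0], w1, cov_scale[0])
              - excess_risk(w_sep[1], w2, cov_scale[1]))
    return {"ODD": odd, "EDD": edd, "ADD": odd / edd}


if __name__ == "__main__":
    print(simulate())
```

Under these assumptions, an ADD above 1 would correspond to the joint model showing a larger between-group risk gap than the per-group models do. Averaging over several seeds gives a steadier estimate than a single draw.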

Paper Structure

This paper contains 60 sections, 9 theorems, 187 equations, 15 figures.

Key Result

Theorem 3.1

Under Assumptions ass:commute and ass:scaling-random-proj, it holds that $R_s (\widehat{f}) \simeq B_s (\widehat{f}) + V_s (\widehat{f})$, with

Figures (15)

  • Figure 1: $ODD$, $EDD$, and $ADD$ phase diagrams for ridge regression with random projections. We plot the bias amplification phase diagrams with respect to $\phi$ (rate of features to samples) and $\psi$ (rate of parameters to samples), as predicted by our theory for ridge regression with random projections (Theorems \ref{thm:odd-theory-rand-proj}, \ref{thm:edd-theory-rand-proj}). Red regions indicate theoretical predictions greater than 1 (i.e., bias amplification in the rightmost plot), while blue regions indicate theoretical predictions less than 1 (i.e., bias deamplification in the rightmost plot). Darkness indicates intensity. We consider isotropic covariance matrices: $\Sigma_1 = 2 I_d, \Sigma_2 = I_d$, $\Theta = 2 I_d$, $\Delta = I_d$. Additionally, $n = 1 \times 10^4, \sigma_1^2 = \sigma_2^2 = 1$. We further choose $\lambda = \lambda_1 = \lambda_2 = 1 \times 10^{-6}$ to approximate the minimum-norm interpolator. We show that bias amplification can occur even in the balanced data setting, i.e., when $p_1 = p_2 = 1/2$.
  • Figure 2: Our theory predicts that models can amplify bias even with balanced groups and without spurious correlations. We empirically validate our theory (Theorems \ref{thm:odd-theory-rand-proj} and \ref{thm:edd-theory-rand-proj}) for $ODD$, $EDD$, and $ADD$ under the setup described in Section \ref{sec:bias-amp-iso-setup}, with $a_1 = 0.5, a_2 = 1, \sigma_1^2 = 1$, and $\sigma_2^2 = 1 \times 10^{-5}$. The solid lines capture empirical values while the corresponding lower-opacity dashed lines represent what our theory predicts. We plot $ODD$ and $EDD$ on the same scale for easy comparison, and include a black dashed line at $ADD = 1$ to contrast bias amplification vs. deamplification. We include all the plots with error bars in Appendix \ref{sec:bias-amp-plots}.
  • Figure 3: Our theory predicts that disparate label noise between groups deamplifies bias on Colored MNIST. We plot the $ODD$ and $EDD$ of a CNN over training time $t$ for Colored MNIST. As $t$ increases, the $ODD$ is relatively low while the $EDD$ is noticeably higher. The error bars capture the standard deviation computed over 10 random seeds.
  • Figure 4: Minority-group test risk can peak with different model sizes depending on the rate of features to samples. We empirically demonstrate that minority-group bias is affected by extraneous features. We validate our theory (Theorems \ref{thm:odd-theory-rand-proj} and \ref{thm:edd-theory-rand-proj}) for the joint $R_1, R_2$ (i.e., a single model learned for both groups) and the separate $R_1, R_2$ (i.e., a separate model learned per group) under the setup described in Section \ref{sec:over-time-setup}, with $a_1 = 2, b_2 = 0.2$, and $\pi = 0.5$. The solid lines capture empirical values while the corresponding lower-opacity dashed lines represent what our theory predicts. We include a black dashed line at $ADD = 1$ to contrast bias amplification vs. deamplification. All y-axes are on the same scale for easy comparison. All the plots with error bars are in Appendix \ref{sec:spurious-plots}.
  • Figure 5: $ODD$, $EDD$, and $ADD$ phase diagrams for classical ridge regression. We plot the bias amplification phase diagrams with respect to $\phi$ (rate of features to samples), as predicted by our theory for ridge regression without random projections (Theorems \ref{thm:odd-theory}, \ref{thm:edd-theory}). Dashed black lines indicate theoretical predictions. We consider isotropic covariance matrices: $\Sigma_1 = 2 I_d, \Sigma_2 = I_d$, $\Theta = 2 I_d$, $\Delta = I_d$. Additionally, $n = 1 \times 10^4, \sigma_1^2 = \sigma_2^2 = 1$. We further choose $\lambda = \lambda_1 = \lambda_2 = 1 \times 10^{-6}$ to approximate the minimum-norm interpolator. We observe that bias amplification can occur even in the balanced data setting, i.e., when $p_1 = p_2 = 1/2$, without spurious correlations.
  • ...and 10 more figures
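The Figure 1 caption indexes its phase diagrams by $\phi$ (features to samples) and $\psi$ (parameters to samples). A single point of such a diagram could be probed empirically along the following lines. This is a hedged sketch rather than the paper's procedure: it assumes $\phi = d/n$, $\psi = m/n$, a Gaussian projection matrix $S \in \mathbb{R}^{d \times m}$ with i.i.d. $N(0, 1/d)$ entries, a predictor of the form $x \mapsto \eta^\top S^\top x$ with $\eta$ fit by ridge regression, and excess risk as the per-group risk. The function names are hypothetical, and only the $ODD$ of the jointly trained model is computed.

```python
import numpy as np


def sample_two_groups(rng, n, d, p1=0.5):
    """Draw a two-group regression dataset (assumed setup, see lead-in)."""
    w1 = rng.normal(size=d) * np.sqrt(2.0 / d)        # w_1* ~ N(0, Theta/d), Theta = 2 I
    w2 = w1 + rng.normal(size=d) * np.sqrt(1.0 / d)   # w_2* = w_1* + delta, Delta = I
    n1 = int(p1 * n)
    X1 = rng.normal(size=(n1, d)) * np.sqrt(2.0)      # Sigma_1 = 2 I
    X2 = rng.normal(size=(n - n1, d))                 # Sigma_2 = I
    y1 = X1 @ w1 + rng.normal(size=n1)                # sigma_1^2 = 1
    y2 = X2 @ w2 + rng.normal(size=n - n1)            # sigma_2^2 = 1
    return np.vstack([X1, X2]), np.concatenate([y1, y2]), w1, w2


def random_projection_ridge(X, y, S, lam):
    """Ridge regression on projected features Z = X S; returns S @ eta in R^d."""
    Z = X @ S
    n, m = Z.shape
    eta = np.linalg.solve(Z.T @ Z / n + lam * np.eye(m), Z.T @ y / n)
    return S @ eta  # effective weight vector in the original feature space


def odd_at(phi, psi, n=1000, lam=1e-6, seed=0):
    """Empirical ODD of the joint model at (phi, psi) = (d/n, m/n)."""
    rng = np.random.default_rng(seed)
    d = max(1, round(phi * n))
    m = max(1, round(psi * n))
    X, y, w1, w2 = sample_two_groups(rng, n, d)
    S = rng.normal(size=(d, m)) / np.sqrt(d)          # assumed N(0, 1/d) entries
    w_eff = random_projection_ridge(X, y, S, lam)
    r1 = 2.0 * np.sum((w_eff - w1) ** 2)              # excess risk under Sigma_1 = 2 I
    r2 = np.sum((w_eff - w2) ** 2)                    # excess risk under Sigma_2 = I
    return abs(r1 - r2)


if __name__ == "__main__":
    grid = [0.25, 0.5, 1.0, 2.0]
    odd = np.array([[odd_at(phi, psi, n=200) for psi in grid] for phi in grid])
    print(odd)
```

The small `lam` mirrors the caption's choice of $\lambda = 1 \times 10^{-6}$ to approximate the minimum-norm interpolator. The default `n` is far smaller than the caption's $n = 1 \times 10^4$ to keep the sketch fast, so finite-sample fluctuations should be expected.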

Theorems & Definitions (23)

  • Definition 2.1: Bias Amplification
  • Definition 3.1
  • Theorem 3.1
  • Theorem 3.2
  • Definition D.1
  • Theorem D.1
  • Definition D.2
  • Theorem D.2
  • proof
  • Lemma E.1
  • ...and 13 more