Post-processing for Fair Regression via Explainable SVD

Zhiqun Zuo, Ding Zhu, Mohammad Mahdi Khalili

TL;DR

The paper tackles fair regression under statistical parity by post-processing neural network weights with an explainable SVD (ESVDFair). Using the transformations WS_e and WS_v, the authors build moment-based bounds on the disparity between groups and thereby turn the fairness constraints into convex, tractable optimization problems whose adjusted singular values have closed-form-like solutions. The method reduces first- and second-moment disparities in sequence via covariance-aware and moment-based constraints, then refits the last layer by least squares to preserve accuracy. Experiments on the Law School and COMPAS datasets show fairness-accuracy trade-offs competitive with state-of-the-art post-processing baselines, without requiring the sensitive attribute at inference time.

Abstract

This paper presents a post-processing algorithm for training fair neural network regression models that satisfy statistical parity, utilizing an explainable singular value decomposition (SVD) of the weight matrix. We propose a linear transformation of the weight matrix, whereby the singular values derived from the SVD of the transformed matrix directly correspond to the differences in the first and second moments of the output distributions across two groups. Consequently, we can convert the fairness constraints into constraints on the singular values. We analytically solve the problem of finding the optimal weights under these constraints. Experimental validation on various datasets demonstrates that our method achieves a similar or superior fairness-accuracy trade-off compared to the baselines without using the sensitive attribute at inference time.
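To make the link between a weight matrix and the two moment gaps concrete, here is a minimal NumPy sketch. It is not the ESVDFair algorithm: the synthetic data, the function `moment_gaps`, and the projection step are all assumptions introduced for illustration. It only shows that, for a linear map $Wx$, the gap in group means is $W(\mu_1 - \mu_2)$ and the gap in group covariances is $W(\Sigma_1 - \Sigma_2)W^\top$, so both disparities are controlled by $W$ alone.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs to one linear layer for two sensitive groups (synthetic).
d_in, d_out, n = 5, 3, 20000
x1 = rng.normal(size=(n, d_in))                # group 1: mean 0, unit variance
x2 = rng.normal(size=(n, d_in)) * 1.5 + 0.5    # group 2: shifted and rescaled

W = rng.normal(size=(d_out, d_in))             # stand-in for a trained weight matrix


def moment_gaps(W, x1, x2):
    """Norms of the first- and second-moment gaps of W x between the groups."""
    mean_gap = W @ (x1.mean(axis=0) - x2.mean(axis=0))
    cov_gap = W @ (np.cov(x1, rowvar=False) - np.cov(x2, rowvar=False)) @ W.T
    return np.linalg.norm(mean_gap), np.linalg.norm(cov_gap, ord="fro")


# Illustrative adjustment (an assumption, not the paper's method): remove the
# component of W that acts on the empirical mean-difference direction.
delta = x1.mean(axis=0) - x2.mean(axis=0)
u = delta / np.linalg.norm(delta)
W_proj = W - np.outer(W @ u, u)

gap_before = moment_gaps(W, x1, x2)
gap_after = moment_gaps(W_proj, x1, x2)
print("mean gap before/after:", gap_before[0], gap_after[0])
print("cov gap before/after: ", gap_before[1], gap_after[1])
```

The projection drives the first-moment gap to numerical zero but leaves a second-moment gap, which is why a method of this kind needs a separate treatment of the covariance term.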

Paper Structure

This paper contains 27 sections, 8 theorems, 72 equations, 5 figures, 4 tables, 2 algorithms.

Key Result

Lemma 2.1

Assume that, for some $l$, $\mathscr{X}^{[l]}_{a}$ follows a multivariate normal distribution for $a\in \{1,2\}$. If $\mathscr{Z}_{1}^{[l]}$ and $\mathscr{Z}_{2}^{[l]}$ have the same mean and covariance matrix, then $\hat{\mathscr{Y}}$ is independent of $\mathscr{A}$.
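The reasoning behind the lemma can be sketched as follows. This excerpt does not define $\mathscr{Z}^{[l]}_{a}$, so the sketch assumes it is an affine function of $\mathscr{X}^{[l]}_{a}$, written here as $W\mathscr{X}^{[l]}_{a} + b$, and that the remaining layers form a deterministic map $g$ that does not take $\mathscr{A}$ as input. The symbols $W$, $b$, $g$, $\mu_a$, and $\Sigma_a$ are introduced for illustration only.

$$
\mathscr{X}^{[l]}_{a} \sim \mathcal{N}(\mu_a, \Sigma_a)
\;\Longrightarrow\;
\mathscr{Z}^{[l]}_{a} = W\mathscr{X}^{[l]}_{a} + b \sim \mathcal{N}\!\left(W\mu_a + b,\; W\Sigma_a W^{\top}\right)
$$

$$
W\mu_1 = W\mu_2,\quad W\Sigma_1 W^{\top} = W\Sigma_2 W^{\top}
\;\Longrightarrow\;
\mathscr{Z}^{[l]}_{1} \overset{d}{=} \mathscr{Z}^{[l]}_{2}
\;\Longrightarrow\;
g\!\left(\mathscr{Z}^{[l]}_{1}\right) \overset{d}{=} g\!\left(\mathscr{Z}^{[l]}_{2}\right)
$$

A Gaussian is fully determined by its mean and covariance, so matching those two moments makes the layer's output identically distributed in both groups. Under the assumptions above, the distribution of $\hat{\mathscr{Y}}$ given $\mathscr{A}=a$ is then the same for $a=1$ and $a=2$, which is the stated independence.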

Figures (5)

  • Figure 1: Density of the output distribution across two sensitive groups on the Law School Success dataset.
  • Figure 2: Density of the output distribution across two sensitive groups on the COMPAS dataset.
  • Figure 3: Density of the output distribution across two sensitive groups on the Law School Success dataset with different $\tilde{c}_{e}$.
  • Figure 4: KS vs. MSE with different $\tilde{c}_{e}$.
  • Figure 5: Density of the output distribution across two sensitive groups on the Law School Success dataset with different $\tilde{c}_{v}$.
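
The figures above compare per-group output densities and plot KS against MSE. Assuming "KS" denotes the two-sample Kolmogorov-Smirnov statistic between the two groups' predictions (the excerpt does not spell this out), the two quantities could be computed roughly as below. The function names `ks_statistic` and `mse` and the synthetic predictions are hypothetical.

```python
import numpy as np


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])  # the maximum gap occurs at a sample point
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))


# Synthetic predictions for two groups (illustrative only, not paper data).
rng = np.random.default_rng(0)
pred_group1 = rng.normal(loc=0.0, scale=1.0, size=1000)
pred_group2 = rng.normal(loc=0.3, scale=1.0, size=1000)
print("KS between groups:", ks_statistic(pred_group1, pred_group2))
```

A KS value of 0 means the two empirical output distributions coincide, and a value of 1 means they do not overlap at all, so lower KS at a given MSE indicates a better fairness-accuracy trade-off.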

Theorems & Definitions (12)

  • Lemma 2.1
  • Lemma 3.1
  • Theorem 3.1
  • Corollary 3.1
  • Theorem 3.2
  • Theorem 4.1
  • Theorem 4.2
  • proof
  • proof
  • Lemma A.1
  • ...and 2 more