TowerDebias: A Novel Unfairness Removal Method Based on the Tower Property

Norman Matloff, Aditya Mittal

TL;DR

The paper addresses the challenge of removing sensitive-attribute influence from predictions produced by black-box models without retraining. It introduces towerDebias (tDB), a post-processing method that leverages the Tower Property to estimate $E(Y|X)$ by averaging $E(Y|X,S)$ over $S$, with a $k$-nearest-neighbors extension when exact matches on $X$ are unavailable. A formal fairness-improvement theorem and a closed-form expression for correlation reduction under a trivariate normal model are provided, along with an $L_2$-space interpretation of the method. Empirical results across regression and classification tasks demonstrate meaningful reductions in the Pearson correlation between predictions and sensitive attributes, with modest accuracy trade-offs and favorable comparisons to FairML variants, highlighting tDB’s broad applicability to real-world black-box systems.
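The averaging step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `tower_debias`, the choice of Euclidean distance, the value of `k`, and the synthetic data are all assumptions made for illustration. The sketch relies on the Tower Property, $E(Y|X) = E[\,E(Y|X,S) \mid X\,]$, and approximates the outer expectation by averaging the black-box predictions of the $k$ rows whose $X$ values lie closest to the new case.

```python
import numpy as np

def tower_debias(x_train, pred_train, x_new, k=5):
    """Hypothetical helper: average black-box predictions over the k
    training rows nearest to each new row in X (S is never used here)."""
    x_train = np.asarray(x_train, dtype=float)
    pred_train = np.asarray(pred_train, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    # Squared Euclidean distances, shape (n_new, n_train).
    diffs = x_new[:, None, :] - x_train[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    # Column indices of the k smallest distances in each row (unordered).
    nearest = np.argpartition(dists, k - 1, axis=1)[:, :k]
    # Neighborhood mean of E(Y|X,S) stands in for E(Y|X).
    return pred_train[nearest].mean(axis=1)

# Synthetic illustration (assumed data, not taken from the paper).
rng = np.random.default_rng(0)
n = 600
X = rng.normal(size=(n, 2))              # non-sensitive features
S = (rng.random(n) < 0.5).astype(float)  # binary sensitive attribute
black_box = X[:, 0] + 2.0 * S            # stand-in for a model using X and S

# Use the first 400 rows as the reference set and debias the remaining 200.
x_ref, pred_ref = X[:400], black_box[:400]
x_test, s_test, pred_test = X[400:], S[400:], black_box[400:]
debiased = tower_debias(x_ref, pred_ref, x_test, k=20)

corr_before = np.corrcoef(pred_test, s_test)[0, 1]
corr_after = np.corrcoef(debiased, s_test)[0, 1]
print(f"corr(pred, S) before: {corr_before:.3f}, after: {corr_after:.3f}")
```

In this sketch a larger `k` averages over more values of $S$ and so removes more of its influence, but it also smooths the predictions over a wider region of $X$, which is one way the fairness and accuracy trade-off mentioned above can arise.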

Abstract

Decision-making processes have increasingly come to rely on sophisticated machine learning tools, raising critical concerns about the fairness of their predictions with respect to sensitive groups. The widespread adoption of commercial "black-box" models necessitates careful consideration of their legal and ethical implications for consumers. When users interact with such black-box models, a key challenge arises: how can the influence of sensitive attributes, such as race or gender, be mitigated or removed from their predictions? We propose towerDebias (tDB), a novel post-processing method designed to reduce the influence of sensitive attributes in predictions made by black-box models. Our tDB approach leverages the Tower Property from probability theory to improve prediction fairness without requiring retraining of the original model. This method is highly versatile, as it requires no prior knowledge of the original algorithm's internal structure and is adaptable to a diverse range of applications. We present a formal fairness improvement theorem for tDB and showcase its effectiveness in both regression and classification tasks using multiple real-world datasets.

Paper Structure

This paper contains 23 sections, 20 equations, 12 figures, 1 table.

Figures (12)

  • Figure 3: Effect of towerDebias on the increase in misclassification rate for recidivism prediction and on the reduction in correlation with race in the COMPAS dataset.
  • Figure : ML versus towerDebias: SVCensus Results
  • Figure : ML versus towerDebias: Law School Admission Results
  • Figure : ML versus towerDebias: COMPAS Results
  • Figure : ML versus towerDebias: IranianChurn Results
  • ...and 7 more figures