Fair Feature Importance Scores via Feature Occlusion and Permutation

Camille Little; Madeline Navarro; Santiago Segarra; Genevera Allen

TL;DR

The paper addresses the need for interpretable assessment of how individual features influence model fairness, beyond traditional accuracy-focused feature importance. It introduces two model-agnostic fair feature-importance scores: a permutation-based score $\rho_{\rm perm}(j)$ and an occlusion-based score $\rho_{\rm occl}(j)$, with the latter efficiently computed via minipatch learning. The authors provide formal definitions, discuss trade-offs, and validate the methods through synthetic simulations and case studies (Adult Income and German Credit) using a Random Forest and demographic parity as the fairness metric. The results demonstrate that the scores identify biased vs. informative features and reveal fairness-accuracy dynamics, offering scalable tools for auditing and improving fairness in diverse tasks. Overall, the work delivers practical, interpretable metrics that can guide feature selection and fairness auditing across models and fairness definitions.

Abstract

As machine learning models increasingly impact society, their opaque nature poses challenges to trust and accountability, particularly in fairness contexts. Understanding how individual features influence model outcomes is crucial for building interpretable and equitable models. While feature importance metrics for accuracy are well established, methods for assessing feature contributions to fairness remain underexplored. We propose two model-agnostic approaches to measure fair feature importance. First, we compare model fairness before and after permuting a feature's values. This simple intervention-based approach decouples the feature from the model's predictions to measure its contribution to training. Second, we evaluate the fairness of models trained with and without a given feature. This occlusion-based score can be computed far more cheaply via minipatch learning. Our empirical results reflect the simplicity and effectiveness of the proposed metrics across multiple predictive tasks. Both methods offer simple, scalable, and interpretable ways to quantify the influence of features on fairness, providing new tools for responsible machine learning development.
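The two scores described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the sign convention (unfairness after the intervention minus unfairness before, so that a negative score marks a feature that hurts fairness, matching the Figure 1 caption), the least-squares stand-in model (the paper uses a Random Forest), and the toy data are all assumptions made for illustration. The occlusion score here retrains naively once per feature and does not implement the minipatch speed-up.

```python
import numpy as np

def demographic_parity_gap(y_pred, z):
    """Absolute difference in positive-prediction rates between groups z=1 and z=0."""
    return abs(y_pred[z == 1].mean() - y_pred[z == 0].mean())

def fit_linear_classifier(X, y):
    """Hypothetical stand-in model: least-squares linear classifier.
    Returns a predict function so any other model could be swapped in."""
    A = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(A, 2.0 * y - 1.0, rcond=None)

    def predict(X_new):
        A_new = np.column_stack([X_new, np.ones(len(X_new))])
        return (A_new @ w > 0).astype(int)

    return predict

def fair_importance_perm(predict, X, z, j, rng, n_repeats=20):
    """Permutation-style score (assumed form): mean gap with feature j
    shuffled, minus the gap of the unmodified data. The model is not refit."""
    base = demographic_parity_gap(predict(X), z)
    gaps = []
    for _ in range(n_repeats):
        X_perm = X.copy()
        X_perm[:, j] = rng.permutation(X_perm[:, j])
        gaps.append(demographic_parity_gap(predict(X_perm), z))
    return float(np.mean(gaps) - base)

def fair_importance_occl(fit, X, y, z, j):
    """Occlusion-style score (assumed form): gap of a model trained without
    feature j, minus the gap of a model trained on all features."""
    keep = [k for k in range(X.shape[1]) if k != j]
    full = fit(X, y)
    reduced = fit(X[:, keep], y)
    gap_full = demographic_parity_gap(full(X), z)
    gap_reduced = demographic_parity_gap(reduced(X[:, keep]), z)
    return float(gap_reduced - gap_full)

# Toy data (assumption): feature 0 is predictive and correlated with the
# protected attribute z, feature 1 is predictive only, feature 2 is noise.
rng = np.random.default_rng(0)
n = 2000
z = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 3))
X[:, 0] += 2.0 * z
y = (X[:, 0] + X[:, 1] + 0.5 * rng.normal(size=n) > 1.0).astype(int)

predict = fit_linear_classifier(X, y)
rho_perm_biased = fair_importance_perm(predict, X, z, j=0, rng=rng)
rho_perm_noise = fair_importance_perm(predict, X, z, j=2, rng=rng)
rho_occl_biased = fair_importance_occl(fit_linear_classifier, X, y, z, j=0)
print(rho_perm_biased, rho_perm_noise, rho_occl_biased)
```

Under these assumptions, the feature tied to the protected attribute should receive a clearly negative score from both methods, while the pure-noise feature should score near zero. The permutation score only needs predictions from an already trained model, whereas the occlusion score requires retraining, which is the cost the minipatch approach is meant to reduce.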

Paper Structure (8 sections, 3 equations, 2 figures)

Introduction
Fair Feature Importance Scores
Importance Scores via Permutation
Importance Scores via Occlusion
Empirical Studies
Simulation setup and Results
Case Studies
Discussion and Future Work

Figures (2)

Figure 1: Classification and regression results are presented for accuracy, measured by classification error and MSE, and fairness, measured by demographic parity (Feldman et al., 2015), using a Random Forest model. Feature importance scores were computed using the occlusion $\rho_{\rm occl}$ and permutation $\rho_{\rm perm}$ metrics for both simulation types. In these simulations, the first five features are signal features associated with the outcome, while the first two are correlated with the protected attribute, introducing bias. Positive scores indicate that a feature improves fairness or accuracy, while negative scores suggest the opposite. The magnitudes and directions of the scores align as expected, consistent with the simulation design.
Figure 2: Random Forest interpretation using occlusion scores via minipatches for the Adult Income and German Credit datasets. The magnitudes and directions of the importance scores align with prior studies of these datasets.
