Feature Importance Disparities for Data Bias Investigations

Peter W. Chang; Leor Fishman; Seth Neel

TL;DR

This work introduces feature importance disparity ($\texttt{FID}$) as a data-centric diagnostic for data bias investigations (DBI), measuring how a feature's influence differs between a subgroup and the overall population. By formalizing $\texttt{FID}$ with separable feature importance notions and proposing an oracle-efficient optimization using a Cost-Sensitive Classification (CSC) oracle, the authors efficiently identify high-$\texttt{AVG-FID}$ subgroups even in exponentially large subgroup spaces. Empirically, across four datasets and four importance notions (LIME, SHAP, GRAD, LIN-FID), subgroups with large $\texttt{AVG-FID}$ are found; these often align with fairness-metric disparities and generalize out-of-sample, and rich subgroups frequently yield larger disparities than marginal ones. The method offers a practical toolkit for DBI, enabling targeted interventions such as subgroup-specific modeling or data-collection investigations, while acknowledging limitations in explanation stability and the need for predefined sensitive features. The work thus complements fairness research with a data-centric perspective on bias sources in tabular data.

Abstract

It is widely held that one cause of downstream bias in classifiers is bias present in the training data. Rectifying such biases may involve context-dependent interventions such as training separate models on subgroups, removing features with bias in the collection process, or even conducting real-world experiments to ascertain sources of bias. Despite the need for such data bias investigations, few automated methods exist to assist practitioners in these efforts. In this paper, we present one such method that, given a dataset $X$ consisting of protected and unprotected features, outcomes $y$, and a regressor $h$ that predicts $y$ given $X$, outputs a tuple $(f_j, g)$ with the following property: $g$ corresponds to a subset of the training dataset $(X, y)$ such that the $j^{th}$ feature $f_j$ has much larger (or smaller) influence in the subgroup $g$ than on the dataset overall. We call this gap feature importance disparity (FID). We show across $4$ datasets and $4$ common feature importance methods of broad interest to the machine learning community that we can efficiently find subgroups with large FID values even over exponentially large subgroup classes, and that in practice these groups correspond to subgroups with potentially serious bias issues as measured by standard fairness metrics.
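To make the quantity concrete, here is a minimal sketch of an average-importance gap for a single fixed subgroup, read directly off the description above: the mean importance of feature $f_j$ inside the subgroup compared with its mean over the whole dataset. The function name `avg_fid`, the per-row importance table, and the boolean membership list are illustrative assumptions, not the authors' code; in practice the importance values would come from an explanation method such as LIME or SHAP, and the search over subgroups (the CSC-oracle optimization) is not shown.

```python
from statistics import fmean


def avg_fid(importances, in_group, j):
    """Absolute gap between mean importance of feature j in a subgroup and overall.

    importances: list of rows, one per data point, each a list of per-feature
                 importance values (hypothetical input, e.g. precomputed SHAP values).
    in_group:    list of booleans marking subgroup membership for each row.
    j:           index of the feature of interest.
    """
    column = [row[j] for row in importances]
    inside = [value for value, member in zip(column, in_group) if member]
    if not inside:
        raise ValueError("subgroup is empty")
    return abs(fmean(inside) - fmean(column))


# Toy data: feature 0 matters far more for the last two rows than for the first two.
importances = [[0.2, 0.1], [0.4, 0.1], [1.0, 0.3], [1.2, 0.3]]
in_group = [False, False, True, True]
gap = avg_fid(importances, in_group, 0)
```

On this toy input the subgroup mean for feature 0 is 1.1 and the dataset mean is 0.7, so the gap is about 0.4. The sketch only evaluates one candidate subgroup; it says nothing about how candidates are found.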

Paper Structure (35 sections, 7 theorems, 23 equations, 14 figures, 7 tables, 2 algorithms)


Introduction
Case Study: Data Bias Investigation on COMPAS
Results
Related Work
Preliminaries
Optimizing for $\texttt{AVG-FID}$
Experiments
Experimental Details
Experimental Results
Discussion of High $\texttt{FID}$ Subgroups
Comparison of $\texttt{FID}$ Values on Rich vs. Marginal Subgroups
Fairness Metrics
Discussion
Limitations
Reproducibility
...and 20 more sections

Key Result

Theorem 4.1

Let $F$ be a separable $\texttt{FID}$ notion, fix a classifier $h$, subgroup class $\mathcal{G}$, and oracle $\text{CSC}_{\mathcal{G}}$. Then choosing accuracy constant $\nu$ and bound constant $B$ and fixing a feature of interest $f_j$, we will run Algorithm alg:cap twice; once with $\texttt{FID}$

Figures (14)

Figure 1: Exploring a high $\texttt{FID}$ subgroup/feature pair for COMPAS. The first graph compares the average SHAP feature importance for priors-count in the subgroup vs. the dataset as a whole. The second graph shows the $5$ largest coefficients of the linear function of sensitive attributes that define the subgroup.
Figure 2: Summary of the highest $\texttt{FID}$s found for each (dataset, method). This is displayed as $|\log_{10}(R)|$ where $R$ is the ratio of average importance per data point in $g^*$ to the average importance on $X$ for separable notions, or the ratio of coefficients for $\texttt{LIN-FID}$. This scale allows comparison across different importance notions. The feature associated with each $g^*$ is written above the bar.
Figure 3: Distribution of $\texttt{AVG-FID}$ on the top features from the BANK dataset using LIME. We see a sharp drop off in $\texttt{AVG-FID}$. This pattern is seen in all datasets and notions.
Figure 4: Exploration of key subgroup/feature pairs found for each dataset. The first graph shows the change in feature importance from whole dataset to subgroup. The second graph shows the main coefficients that define the subgroup.
Figure 5: Comparisons of some maximal $\texttt{FID}$ rich subgroups to the maximal $\texttt{FID}$ marginal subgroup on the same feature using the same log-scale as in Figure 2. The feature associated with the subgroups is written above each bar.
...and 9 more figures
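
The log-ratio scale described in the Figure 2 caption can be sketched as follows, assuming the two average importances have already been computed and share the same sign. The function name `log_ratio_scale` and its arguments are hypothetical and serve only to illustrate the caption.

```python
import math


def log_ratio_scale(avg_importance_subgroup, avg_importance_dataset):
    """Return |log10(R)|, where R is the subgroup-to-dataset importance ratio.

    Assumes both averages are nonzero and share a sign, so that R > 0.
    """
    if avg_importance_dataset == 0:
        raise ValueError("dataset average importance is zero; ratio undefined")
    ratio = avg_importance_subgroup / avg_importance_dataset
    if ratio <= 0:
        raise ValueError("ratio must be positive to take a logarithm")
    return abs(math.log10(ratio))


# A tenfold increase and a tenfold decrease land at the same height on this scale.
up = log_ratio_scale(1.0, 0.1)
down = log_ratio_scale(0.1, 1.0)
```

Taking the absolute value makes a feature that is ten times more important in the subgroup and one that is ten times less important plot at the same height, which is why the scale is comparable across importance notions with different units.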

Theorems & Definitions (17)

Definition 3.1
Definition 3.2
Definition 3.3
Theorem 4.1
Lemma 4.1
proof
proof
Lemma 4.2: freund
Lemma 4.3: msr
Definition 6.1
...and 7 more
