Testing for Causal Fairness
Jiarun Fu, LiZhong Ding, Pengqi Li, Qiuning Wei, Yurong Cheng, Xu Chen
TL;DR
The paper tackles the inadequacy of the data-based Potential Outcomes Framework (POF) in high-dimensional fairness analysis by introducing a distribution-based POF and reframing fairness analysis as Distributional Closeness Testing (DCT). It defines the counterfactual closeness fairness criterion and develops the Norm-Adaptive Maximum Mean Discrepancy Treatment Effect (N-TE) statistic, which underpins the Counterfactual Fairness-CLOseness Testing (CF-CLOT) procedure with bootstrap-based inference and an asymptotic Gaussian guarantee. By tuning the closeness parameter $\epsilon$, CF-CLOT offers strong, neutral, or weak sensitivity to distributional differences between factual and counterfactual outcomes, enabling nuanced fairness assessment. The approach is validated on real-world tabular datasets and high-dimensional tasks (FER and CRA), where it identifies unfair sensitive attributes and performs competitively against SCM-based methods. Overall, CF-CLOT provides a theoretically grounded and practically flexible framework for trustworthy fairness testing in complex, high-dimensional settings.
Abstract
Causality is widely used in fairness analysis to prevent discrimination based on sensitive attributes, such as gender in career recruitment and race in crime prediction. However, the current data-based Potential Outcomes Framework (POF) often leads to untrustworthy fairness analysis results when handling high-dimensional data. To address this, we introduce a distribution-based POF that transforms fairness analysis into Distributional Closeness Testing (DCT) by intervening on sensitive attributes. We define counterfactual closeness fairness as the null hypothesis of DCT, where a sensitive attribute is considered fair if its factual and counterfactual potential outcome distributions are sufficiently close. We introduce the Norm-Adaptive Maximum Mean Discrepancy Treatment Effect (N-TE) as a statistic for measuring distributional closeness and apply DCT using the empirical estimator of N-TE, referred to as Counterfactual Fairness-CLOseness Testing ($\textrm{CF-CLOT}$). To ensure the trustworthiness of testing results, we establish the testing consistency of N-TE through rigorous theoretical analysis. $\textrm{CF-CLOT}$ demonstrates sensitivity in fairness analysis through the flexibility of the closeness parameter $\epsilon$. In extensive experiments across various real-world scenarios, $\textrm{CF-CLOT}$ successfully identifies unfair sensitive attributes, which validates the consistency of the testing.
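To make the idea of closeness testing concrete, the following is a minimal sketch and not the authors' CF-CLOT procedure. It swaps in a plain (biased) squared Maximum Mean Discrepancy with a Gaussian kernel in place of N-TE, and a simple percentile bootstrap in place of the paper's inference. The function names (`gaussian_kernel`, `mmd2`, `closeness_test`), the bandwidth, and the bootstrap settings are all assumptions made for illustration. The null hypothesis "discrepancy at most $\epsilon$" is rejected when a bootstrap lower bound on the discrepancy exceeds $\epsilon$.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


def mmd2(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    return (gaussian_kernel(x, x, bandwidth).mean()
            + gaussian_kernel(y, y, bandwidth).mean()
            - 2.0 * gaussian_kernel(x, y, bandwidth).mean())


def closeness_test(factual, counterfactual, epsilon,
                   n_boot=200, alpha=0.05, bandwidth=1.0, seed=0):
    """Illustrative test of H0: discrepancy <= epsilon (the attribute is 'fair').

    Assumption: a percentile-bootstrap lower bound stands in for the
    paper's inference; H0 is rejected when that bound exceeds epsilon.
    """
    rng = np.random.default_rng(seed)
    stat = mmd2(factual, counterfactual, bandwidth)
    boots = np.empty(n_boot)
    for b in range(n_boot):
        # Resample each group with replacement and recompute the statistic.
        xi = rng.integers(0, len(factual), len(factual))
        yi = rng.integers(0, len(counterfactual), len(counterfactual))
        boots[b] = mmd2(factual[xi], counterfactual[yi], bandwidth)
    lower = float(np.quantile(boots, alpha))
    return float(stat), lower, bool(lower > epsilon)
```

In this sketch a larger $\epsilon$ tolerates larger distributional gaps before an attribute is flagged as unfair, while a smaller $\epsilon$ makes the test stricter, which mirrors the role the abstract assigns to the closeness parameter.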
