FairPFN: Transformers Can do Counterfactual Fairness

Jake Robertson, Noah Hollmann, Noor Awad, Frank Hutter

TL;DR

The paper addresses bias in predictive systems by tackling counterfactual fairness without requiring a known causal graph. It introduces FairPFN, a transformer based on prior-fitted networks (PFNs) pretrained on synthetic causal data to remove the causal influence of protected attributes from observational data. Results show FairPFN achieves competitive or superior fairness-accuracy trade-offs across synthetic benchmarks and real-world datasets, including substantial reductions in $TCE$ while maintaining predictive performance, and competitive counterfactual MAE relative to baselines. This approach broadens the practical applicability of causal fairness by eliminating the need for exact causal models, enabling more robust deployment in domains with complex or unknown causal structures.

Abstract

Machine learning systems are increasingly prevalent across healthcare, law enforcement, and finance but often operate on historical data, which may carry biases against certain demographic groups. Causal and counterfactual fairness provides an intuitive way to define fairness that closely aligns with legal standards. Despite its theoretical benefits, counterfactual fairness comes with several practical limitations, largely related to the reliance on domain knowledge and approximate causal discovery techniques in constructing a causal model. In this study, we take a fresh perspective on counterfactually fair prediction, building upon recent work in in-context learning (ICL) and prior-fitted networks (PFNs) to learn a transformer called FairPFN. This model is pretrained using synthetic fairness data to eliminate the causal effects of protected attributes directly from observational data, removing the requirement of access to the correct causal model in practice. In our experiments, we thoroughly assess the effectiveness of FairPFN in eliminating the causal impact of protected attributes on a series of synthetic case studies and real-world datasets. Our findings pave the way for a new and promising research area: transformers for causal and counterfactual fairness.

Paper Structure (9 sections, 12 figures)

This paper contains 9 sections and 12 figures.

Figures (12)

  • Figure 1: FairPFN Pre-training: FairPFN is pre-trained on a synthetic prior of datasets generated from sparse SCMs with exogenous protected attributes. A biased dataset is generated and passed as context to the transformer, and the loss is calculated with respect to the fair outcomes calculated by removing the causal influence of the protected attribute.
  • Figure 2: Causal Case Studies: Visualization and data generating processes of synthetic causal case studies, a handcrafted set of benchmarks designed to evaluate FairPFN's ability to remove various sources of bias in causally generated data.
  • Figure 3: Real-World Datasets: Causal graphs of real-world datasets Law School Admissions and Adult Census Income.
  • Figure 4: Causal Effect Removal (Synthetic): Average causal effect (IE, DE, or TEE) and error (1-AUC) of FairPFN compared to our baselines. FairPFN is on the Pareto Front across all synthetic case studies, dominates EGR on 5 out of 6, and always improves upon CFP in terms of error.
  • Figure 5: Effect of Noise Terms (Synthetic): Causal effect (TCE) and error (1-AUC) of FairPFN compared to the Unfair baseline on each individual dataset from our causal case studies. We provide a color gradient for both baselines (blue to green and red to yellow) to depict increasing amounts of noise in the data. FairPFN consistently reduces the TCE on all benchmark groups, achieving lower error on datasets with larger amounts of noise.
  • ...and 7 more figures
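
The pre-training setup described in the Figure 1 caption can be illustrated with a small sketch. This is not the authors' prior or code: it assumes a fixed, linear toy SCM (A -> X1 -> Y, A -> Y, X2 -> Y) with hypothetical edge weights and a continuous outcome, whereas the paper describes a prior over sparse SCMs and reports error as 1-AUC. The sketch assumes that "removing the causal influence" can be modeled by cutting every outgoing edge of the protected attribute while reusing the same noise, and it takes TCE to mean the average outcome difference between do(A=1) and do(A=0). All function and variable names are illustrative.

```python
import random

# Hypothetical edge weights for a toy linear SCM: A -> X1 -> Y, A -> Y, X2 -> Y.
W_A_X1, W_A_Y, W_X1_Y, W_X2_Y = 1.5, 1.0, 0.8, 0.5


def mechanism(a, noise, keep_a_edges=True):
    """Return (x1, x2, y) for one unit; optionally cut all outgoing edges of A."""
    e1, e2, ey = noise
    gate = 1.0 if keep_a_edges else 0.0  # 0.0 removes A's causal influence
    x1 = gate * W_A_X1 * a + e1
    x2 = e2
    y = W_X1_Y * x1 + W_X2_Y * x2 + gate * W_A_Y * a + ey
    return x1, x2, y


def sample_task(n=1000, seed=0):
    """Sample one toy task: a biased context, fair targets, and the biased TCE."""
    rng = random.Random(seed)
    context, fair_targets, effects = [], [], []
    for _ in range(n):
        a = rng.randint(0, 1)  # exogenous protected attribute
        noise = (rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1))
        x1, x2, y_biased = mechanism(a, noise)
        # Fair target: same unit and same noise, but A's outgoing edges are cut.
        _, _, y_fair = mechanism(a, noise, keep_a_edges=False)
        context.append((a, x1, x2, y_biased))
        fair_targets.append(y_fair)
        # Unit-level effect of do(A=1) versus do(A=0) with the noise held fixed.
        effects.append(mechanism(1, noise)[2] - mechanism(0, noise)[2])
    tce = sum(effects) / n
    return context, fair_targets, tce


if __name__ == "__main__":
    context, fair_targets, tce = sample_task()
    print(f"TCE of the biased outcome: {tce:.3f}")
```

In this toy setting the biased context would be what the transformer sees, and the fair targets would be what the loss is computed against. Because the fair targets are built with A's edges cut, they are identical under either value of A, which is the property a counterfactually fair predictor is meant to approximate.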