Practical Guide for Causal Pathways and Sub-group Disparity Analysis

Farnaz Kohankhaki, Shaina Raza, Oluwanifemi Bamgbose, Deval Pandya, Elham Dolatabadi

TL;DR

The paper tackles the problem of quantifying how sensitive attributes causally influence outcomes in observational data to support fairness audits. It introduces a causal disparity analysis framework that combines counterfactual inference, causal decomposition into direct, indirect, and spurious effects, and the use of Total Variation, together with Generalized Random Forest–based sub-group discovery. Across two real-world datasets, Adult and HDMA, it shows that subgroups with larger direct causal effects often exhibit greater ML fairness gaps, even when observed disparities are small, underscoring the value of causal reasoning in bias audits. The work discusses limitations and outlines future directions, including intersectional analyses and exploring additional causal discovery methods to strengthen fairness assessments in AI systems.

Abstract

In this study, we introduce the application of causal disparity analysis to unveil intricate relationships and causal pathways between sensitive attributes and the targeted outcomes within real-world observational data. Our methodology involves employing causal decomposition analysis to quantify and examine the causal interplay between sensitive attributes and outcomes. We also emphasize the significance of integrating heterogeneity assessment in causal disparity analysis to gain deeper insights into the impact of sensitive attributes within specific sub-groups on outcomes. Our two-step investigation focuses on datasets where race serves as the sensitive attribute. The results on two datasets indicate the benefit of leveraging causal analysis and heterogeneity assessment not only for quantifying biases in the data but also for disentangling their influences on outcomes. We demonstrate that the sub-groups identified by our approach to be affected the most by disparities are the ones with the largest ML classification errors. We also show that grouping the data only based on a sensitive attribute is not enough, and through these analyses, we can find sub-groups that are directly affected by disparities. We hope that our findings will encourage the adoption of such methodologies in future ethical AI practices and bias audits, fostering a more equitable and fair technological landscape.
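As a point of reference for the observed disparity that the causal decomposition starts from, the Total Variation between two groups can be written as the difference in positive-outcome rates, $P(y \mid x_1) - P(y \mid x_0)$. The sketch below is a minimal illustration on made-up toy data, not the authors' implementation: the function name `total_variation`, the binary 0/1 encoding of the sensitive attribute, and the choice of group 0 as the baseline are all assumptions made for this example. It does not perform the decomposition into direct, indirect, and spurious effects.

```python
from typing import Sequence


def total_variation(outcomes: Sequence[int], groups: Sequence[int]) -> float:
    """Observed disparity P(y=1 | x=1) - P(y=1 | x=0).

    Assumes binary outcomes (0/1) and a binary sensitive attribute (0/1),
    with group 0 treated as the baseline. Both encodings are illustrative.
    """
    if len(outcomes) != len(groups):
        raise ValueError("outcomes and groups must have the same length")

    # Split the outcomes by the value of the sensitive attribute.
    y_group1 = [y for y, x in zip(outcomes, groups) if x == 1]
    y_group0 = [y for y, x in zip(outcomes, groups) if x == 0]
    if not y_group1 or not y_group0:
        raise ValueError("both groups must contain at least one record")

    # Difference in positive-outcome rates between the two groups.
    return sum(y_group1) / len(y_group1) - sum(y_group0) / len(y_group0)


# Toy data (hypothetical): 4 records per group.
outcomes = [1, 0, 1, 1, 0, 0, 1, 0]
groups = [1, 1, 1, 1, 0, 0, 0, 0]
tv = total_variation(outcomes, groups)
print(f"Total Variation: {tv:.2f}")
```

A value near zero here says only that the two groups have similar observed outcome rates. As the abstract argues, that alone does not rule out sub-groups that are directly affected, which is why the paper pairs this measure with causal decomposition and heterogeneity assessment.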

Paper Structure (15 sections, 5 equations, 6 figures, 5 tables)

Figures (6)

  • Figure 1: The steps involved in our approach to achieving fairness in ML classification models through causal pathway decomposition and sub-group analysis.
  • Figure 2: Variable importance for top 5 attributes of each experiment.
  • Figure 3: Summary of sub-group analysis for the continuous variables: (a) and (b) represent Age and Hours per Week in the Adult dataset; (c) and (d) represent Application Income and Loan Amount in the HDMA-White dataset; and (e) and (f) represent Application Income and Loan Amount in the HDMA-Asian dataset. Sub-group 1 represents ctf-DE values less than $-0.01$, Sub-group 2 represents ctf-DE values between $-0.01$ and $0.01$ (near-zero effects), Sub-group 3 represents ctf-DE values between $0.01$ and $0.05$, and Sub-group 4 represents ctf-DE values greater than $0.05$.
  • Figure 4: Difference in performance metrics including accuracy, precision, and recall between sensitive groups.
  • Figure 5: Histogram of ctf-DE values
  • ...and 1 more figure
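
The four sub-groups described in the Figure 3 caption are defined by thresholds on the ctf-DE value. The sketch below shows one way to apply those thresholds. The function name `assign_subgroup` and the sample values are hypothetical, and because the caption does not say which side of each boundary is inclusive, the handling of values exactly equal to $-0.01$, $0.01$, or $0.05$ is an assumption of this example.

```python
from collections import Counter


def assign_subgroup(ctf_de: float) -> int:
    """Map a ctf-DE value to the sub-group index used in the Figure 3 caption.

    Boundary handling (<= on the upper edge of groups 2 and 3) is an
    assumption; the caption does not specify it.
    """
    if ctf_de < -0.01:
        return 1  # ctf-DE below -0.01
    if ctf_de <= 0.01:
        return 2  # near-zero effects
    if ctf_de <= 0.05:
        return 3  # ctf-DE between 0.01 and 0.05
    return 4      # ctf-DE above 0.05


# Hypothetical per-record ctf-DE estimates, for illustration only.
sample_ctf_de = [-0.20, -0.005, 0.0, 0.03, 0.04, 0.40]
labels = [assign_subgroup(v) for v in sample_ctf_de]

# Count how many records fall into each sub-group.
counts = Counter(labels)
for group in sorted(counts):
    print(f"Sub-group {group}: {counts[group]} record(s)")
```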