Table of Contents
Fetching ...

A Proof System with Causal Labels (Part II): checking Counterfactual Fairness

Leonardo Ceragioli, Giuseppe Primiero

TL;DR

This paper addresses verifying counterfactual fairness for probabilistic classifiers by embedding causal interventions into the TNDPQ proof system. It formalizes data points and causal relations, introducing predicates for immediate and mediate causes and an intervention operator I(a:alpha), and extends sequents to hypersequents that express factual and counterfactual judgments. A set of rules, including C-Weakening and various v- and I-Cut variants, allows deriving counterfactual sequents and deciding CF by comparing the resulting probabilities p and q. The work demonstrates how to check whether a classifier's output remains invariant under interventions on protected attributes within the atomic fragment, providing a principled basis for assessing fairness in opaque ML systems. It offers a formal framework that can inform verification tools and audits of probabilistic classifiers against counterfactual fairness criteria.

Abstract

In this article we propose an extension to the typed natural deduction calculus TNDPQ to model verification of counterfactual fairness in probabilistic classifiers. This is obtained formulating specific structural conditions for causal labels and checking that evaluation is robust under their variation.

A Proof System with Causal Labels (Part II): checking Counterfactual Fairness

TL;DR

This paper addresses verifying counterfactual fairness for probabilistic classifiers by embedding causal interventions into the TNDPQ proof system. It formalizes data points and causal relations, introducing predicates for immediate and mediate causes and an intervention operator I(a:alpha), and extends sequents to hypersequents that express factual and counterfactual judgments. A set of rules, including C-Weakening and various v- and I-Cut variants, allows deriving counterfactual sequents and deciding CF by comparing the resulting probabilities p and q. The work demonstrates how to check whether a classifier's output remains invariant under interventions on protected attributes within the atomic fragment, providing a principled basis for assessing fairness in opaque ML systems. It offers a formal framework that can inform verification tools and audits of probabilistic classifiers against counterfactual fairness criteria.

Abstract

In this article we propose an extension to the typed natural deduction calculus TNDPQ to model verification of counterfactual fairness in probabilistic classifiers. This is obtained formulating specific structural conditions for causal labels and checking that evaluation is robust under their variation.

Paper Structure

This paper contains 5 sections, 14 equations, 1 figure, 1 table.

Figures (1)

  • Figure 1: Graphs corresponding to factual and counterfactual Data Points. The squares in figure (b) surround variables that depend on the variable we intervene on, and which for this reason cannot be used to decide the target.

Theorems & Definitions (4)

  • Definition 2.1: Counterfactual Fairness (CF)
  • Definition 3.1: Causal Graph
  • Example 1: Factual and Counterfactual sequents
  • Example 2: Evaluation of CF for a classifier