FairQuant: Certifying and Quantifying Fairness of Deep Neural Networks

Brian Hyeongseok Kim, Jingbo Wang, Chao Wang

TL;DR

FairQuant addresses the challenge of certifying and quantifying individual fairness for deep neural networks at scale. It combines abstract, symbolic interval forward analysis with iterative backward refinement to partition the input space and determine fairness, producing certified, falsified, or undecided partitions plus a lower-bound percentage of inputs that are provably fair. The method achieves superior accuracy and scalability compared with prior work and delivers a quantitative measure of fairness unavailable in existing approaches. This enables practitioners to certify fairness guarantees and reason about the proportion of inputs for which a model behaves fairly, with practical impact on safety-critical decision systems.

Abstract

We propose a method for formally certifying and quantifying individual fairness of deep neural networks (DNNs). Individual fairness guarantees that any two individuals who are identical except for a legally protected attribute (e.g., gender or race) receive the same treatment. While there are existing techniques that provide such a guarantee, they tend to suffer from lack of scalability or accuracy as the size and input dimension of the DNN increase. Our method overcomes this limitation by applying abstraction to a symbolic interval-based analysis of the DNN followed by iterative refinement guided by the fairness property. Furthermore, our method lifts the symbolic interval-based analysis from conventional qualitative certification to quantitative certification, by computing the percentage of individuals whose classification outputs are provably fair, instead of merely deciding if the DNN is fair. We have implemented our method and evaluated it on deep neural networks trained on four popular fairness research datasets. The experimental results show that our method is not only more accurate than state-of-the-art techniques but also several orders of magnitude faster.
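
The forward analysis mentioned in the abstract can be illustrated with a minimal sketch. It uses plain interval arithmetic, which is coarser than the symbolic intervals the paper describes, and propagates an input box through a small ReLU network twice, once per value of the protected attribute. The weights, the input box, and every function name below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import List, Tuple

Interval = Tuple[float, float]
Layer = Tuple[List[List[float]], List[float]]  # (weight rows, biases)


def affine_bounds(weights, bias, box: List[Interval]) -> List[Interval]:
    """Bound w.x + b for every output neuron when x ranges over `box`."""
    out = []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, (l, u) in zip(row, box):
            # A positive weight maps the lower input bound to the lower
            # output bound; a negative weight swaps the two.
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out.append((lo, hi))
    return out


def relu_bounds(box: List[Interval]) -> List[Interval]:
    """ReLU is monotone, so it can be applied to each bound separately."""
    return [(max(0.0, l), max(0.0, u)) for l, u in box]


def network_bounds(layers: List[Layer], box: List[Interval]) -> Interval:
    """Over-approximate the single output neuron (ReLU between layers)."""
    for i, (w, b) in enumerate(layers):
        box = affine_bounds(w, b, box)
        if i < len(layers) - 1:
            box = relu_bounds(box)
    return box[0]


def bounds_for_group(layers, box, protected_index, value) -> Interval:
    """Fix the protected attribute to one value, keep the rest as intervals."""
    fixed = list(box)
    fixed[protected_index] = (value, value)
    return network_bounds(layers, fixed)


# Hypothetical 3-input network with one hidden layer of two neurons.
layers = [
    ([[1.0, -0.5, 0.0], [0.0, 2.0, 1.0]], [0.0, -1.0]),
    ([[1.0, -1.0]], [0.5]),
]
box = [(0.0, 2.0), (0.0, 1.0), (0.0, 1.0)]  # input 1 is the protected attribute

o_female = bounds_for_group(layers, box, protected_index=1, value=0.0)
o_male = bounds_for_group(layers, box, protected_index=1, value=1.0)
print(o_female, o_male)
```

With these made-up weights the second interval straddles 0, so a sign-based classification cannot be settled for the whole box; this is the situation in which a partition would need to be refined further.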

Paper Structure (34 sections, 2 theorems, 5 figures, 2 tables, 4 algorithms)

This paper contains 34 sections, 2 theorems, 5 figures, 2 tables, 4 algorithms.

Key Result

Theorem 1

When forward analysis declares an input partition $P\subseteq X$ as fair, the result is guaranteed to be sound in that $f(x)=f(x')$ holds for all $x\in P$ and its counterpart $x'$. Similarly, when forward analysis declares $P$ as unfair, the result is guaranteed to be sound in that $f(x)\neq f(x')$ holds for all $x\in P$ and its counterpart $x'$.
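
One way to read this soundness claim is sketched below. It assumes, as the caption of Figure 4 suggests, that the class is decided by comparing a single raw output against the threshold 0. The symbols $g$ (raw output before thresholding) and $\underline{o},\overline{o},\underline{o}',\overline{o}'$ (interval bounds) are introduced here for illustration and are not the paper's notation. Suppose forward analysis returns intervals $O=[\underline{o},\overline{o}]$ and $O'=[\underline{o}',\overline{o}']$ that over-approximate the outputs, so that $g(x)\in O$ and $g(x')\in O'$ for every $x\in P$ and its counterpart $x'$. Then:

$$
\underline{o}>0 \;\wedge\; \underline{o}'>0
\;\Longrightarrow\; g(x)\ge\underline{o}>0 \ \text{ and } \ g(x')\ge\underline{o}'>0
\;\Longrightarrow\; x \text{ and } x' \text{ receive the same class,}
$$

$$
\underline{o}>0 \;\wedge\; \overline{o}'<0
\;\Longrightarrow\; g(x)\ge\underline{o}>0>\overline{o}'\ge g(x')
\;\Longrightarrow\; x \text{ and } x' \text{ receive different classes.}
$$

The remaining fair and unfair cases follow by exchanging the roles of the upper and lower bounds. If neither pattern holds, the intervals alone do not settle the question for $P$.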

Figures (5)

  • Figure 1: FairQuant for certifying and quantifying fairness of a DNN model $f$ where $x_j$ is a protected attribute and $X$ is the input domain.
  • Figure 2: Symbolic interval analysis of an example DNN for making hiring decisions: the left figure is for female applicants ($i_2\in[0,0]$), and the right figure is for male applicants ($i_2\in[1,1]$). Except for the protected attribute $i_2$, the symbolic intervals of the other attributes are the same.
  • Figure 3: Iterative refinement tree for the example DNN in Fig. 2, to increase the chance of certifying or falsifying the DNN within an input partition.
  • Figure 4: Sufficient conditions for deciding fairness based on symbolic output intervals $O$ and $O'$, and the threshold 0: there are two fair conditions (left) and two unfair conditions (right).
  • Figure 5: Comparing the runtime overhead (left) and accuracy (right) of Fairify (Biswas and Rajan, 2023), in red, and FairQuant (new), in blue.
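
The refinement tree of Figure 3 and the sufficient conditions of Figure 4 can be combined into a rough sketch of how certified, falsified, and undecided percentages might be accumulated. Several parts are assumptions rather than the paper's algorithm: the strict sign tests in `decide`, bisecting the widest dimension as the splitting heuristic, measuring each partition by its volume (i.e. a uniform distribution over the non-protected inputs), and the hypothetical `bounds` callable that returns the pair $(O, O')$ for a partition.

```python
from typing import Callable, Dict, List, Tuple

Interval = Tuple[float, float]
Box = List[Interval]


def decide(o: Interval, o_prime: Interval) -> str:
    """Sign-based sufficient conditions relative to the threshold 0."""
    lo, hi = o
    lo_p, hi_p = o_prime
    if (lo > 0 and lo_p > 0) or (hi < 0 and hi_p < 0):
        return "fair"       # both groups always land on the same side of 0
    if (lo > 0 and hi_p < 0) or (hi < 0 and lo_p > 0):
        return "unfair"     # the two groups always land on opposite sides
    return "undecided"      # at least one interval straddles or touches 0


def volume(box: Box) -> float:
    v = 1.0
    for l, u in box:
        v *= u - l
    return v


def quantify(bounds: Callable[[Box], Tuple[Interval, Interval]],
             box: Box, max_depth: int = 10) -> Dict[str, float]:
    """Split undecided partitions and report percentages of the input box."""
    total = volume(box)
    rates = {"fair": 0.0, "unfair": 0.0, "undecided": 0.0}
    stack = [(box, 0)]
    while stack:
        part, depth = stack.pop()
        verdict = decide(*bounds(part))
        if verdict == "undecided" and depth < max_depth:
            # Bisect the widest dimension (a simple heuristic, assumed here).
            i = max(range(len(part)), key=lambda k: part[k][1] - part[k][0])
            l, u = part[i]
            mid = (l + u) / 2
            stack.append((part[:i] + [(l, mid)] + part[i + 1:], depth + 1))
            stack.append((part[:i] + [(mid, u)] + part[i + 1:], depth + 1))
        else:
            rates[verdict] += 100.0 * volume(part) / total
    return rates


def toy_bounds(part: Box) -> Tuple[Interval, Interval]:
    """Exact bounds for a made-up model: score x - 1 for one group, x + 1 for the other."""
    l, u = part[0]
    return (l - 1.0, u - 1.0), (l + 1.0, u + 1.0)


rates = quantify(toy_bounds, [(-4.0, 4.0)], max_depth=10)
print(rates)
```

In this toy model the two groups are classified differently exactly when $-1 < x < 1$, so the true fair share of $[-4, 4]$ is 75%. The reported `fair` figure stays slightly below that because partitions touching the decision boundary remain undecided at the depth limit, which matches the reading of the certified rate as a lower bound.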

Theorems & Definitions (5)

  • Definition 1: Individual Fairness for a Given Input
  • Definition 2: Individual Fairness for the Input Domain
  • Definition 3: Robustness
  • Theorem 1
  • Theorem 2