FairSample: Training Fair and Accurate Graph Convolutional Neural Networks Efficiently

Zicun Cong; Shi Baoxu; Shan Li; Jaewon Yang; Qi He; Jian Pei

TL;DR

This paper analyzes in depth how graph structure bias, node attribute bias, and model parameters may affect the demographic parity of GCNs, and develops FairSample, a framework that jointly mitigates the three types of biases.

Abstract

Fairness in Graph Convolutional Neural Networks (GCNs) is an increasingly important concern as GCNs are adopted in many crucial applications. Societal biases against sensitive groups may exist in many real-world graphs, and GCNs trained on those graphs may inherit such biases. In this paper, we adopt the well-known fairness notion of demographic parity and tackle the challenge of training fair and accurate GCNs efficiently. We present an in-depth analysis of how graph structure bias, node attribute bias, and model parameters may affect the demographic parity of GCNs. Our insights lead to FairSample, a framework that jointly mitigates the three types of biases. We employ two intuitive strategies to rectify graph structures. First, we inject edges between nodes that are in different sensitive groups but have similar node features. Second, to enhance model fairness and retain model quality, we develop a learnable neighbor sampling policy using reinforcement learning. To address the bias in node features and model parameters, FairSample is complemented by a regularization objective that optimizes fairness.
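To make the fairness notion concrete, the following is a minimal sketch of how a demographic parity gap could be measured for a node classifier. It assumes binary predictions and a binary sensitive attribute, which the abstract does not state; the function name `demographic_parity_gap` and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def demographic_parity_gap(preds: Sequence[int], groups: Sequence[int]) -> float:
    """Absolute difference in positive-prediction rates between groups 0 and 1.

    Assumption: predictions are 0/1 and the sensitive attribute is 0/1.
    """
    if len(preds) != len(groups):
        raise ValueError("preds and groups must have the same length")
    rates = []
    for g in (0, 1):
        # Collect the predictions of the nodes that belong to sensitive group g.
        members = [p for p, s in zip(preds, groups) if s == g]
        if not members:
            raise ValueError(f"group {g} has no members")
        # Fraction of nodes in group g that receive the positive label.
        rates.append(sum(members) / len(members))
    return abs(rates[0] - rates[1])


# Toy example: group 0 is predicted positive 3 out of 4 times, group 1 only 1 out of 4.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
gap = demographic_parity_gap(preds, groups)
print(gap)
```

A gap of 0 means both groups receive positive predictions at the same rate; larger values indicate a larger disparity.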

Paper Structure (23 sections, 1 theorem, 5 equations, 6 figures, 6 tables)

Introduction
Related Work
Fair Graph Convolutional Neural Networks
Sampling Strategies for Efficient GCN Training
Preliminaries
Graph Convolutional Neural Networks
Fairness Notion of Demographic Parity
Intuition and Problem Formulation
FairSample: Sampling for Fair GCNs
The Framework of FairSample
Edge Injector
Computation Graph Sampler
Empirical Study
Datasets and Experiment Settings
Dataset
...and 8 more sections

Key Result

Theorem 1

Denote by $\boldsymbol{\mu}_{a} = \frac{1}{|\mathcal{V}_{a}|}\sum_{v \in \mathcal{V}_{a}}\mathbf{x}_v$ the mean of the node feature vectors in a group $\mathcal{V}_{a}$ and by $dev(\mathcal{V}_a) = \max_{v \in \mathcal{V}_a}\{\| \mathbf{x}_v - \boldsymbol{\mu}_{a}\|_{\infty}\}$ the deviation. Let $\delta_a=\max\{dev(\mathcal{V
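The two quantities defined at the start of the theorem, the group mean and the deviation, can be sketched as follows. This is only an illustration: it reads the norm as taken over the feature vector $\mathbf{x}_v$ of node $v$ (an assumption), and it does not compute $\delta_a$, whose definition is cut off in the statement above. The helper names `group_mean` and `group_deviation` and the toy features are hypothetical.

```python
from typing import List, Sequence


def group_mean(features: Sequence[Sequence[float]]) -> List[float]:
    """Coordinate-wise mean of the feature vectors of one group."""
    n = len(features)
    dim = len(features[0])
    return [sum(x[j] for x in features) / n for j in range(dim)]


def group_deviation(features: Sequence[Sequence[float]]) -> float:
    """Largest infinity-norm distance from any feature vector to the group mean."""
    mu = group_mean(features)
    return max(
        # Infinity norm: the largest absolute coordinate difference.
        max(abs(x[j] - mu[j]) for j in range(len(mu)))
        for x in features
    )


# Toy group with three nodes and two features per node (illustrative values).
group_a = [[1.0, 2.0], [3.0, 2.0], [2.0, 5.0]]
mu_a = group_mean(group_a)
dev_a = group_deviation(group_a)
print(mu_a, dev_a)
```

Here the third node is the farthest from the mean in its second coordinate, so it determines the deviation of the group.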

Figures (6)

Figure 1: (a) An input graph to a 2-layer GCN. (b) The original computation graph of node $v_1$ in the 2-layer GCN. (c) The down-sampled computation graph of node $v_1$ in the 2-layer GCN. The embedding of a node in a layer is plotted on the left of the node. The arrows indicate the directions of embedding aggregation.
Figure 2: An example showing the intuition of the FairSample approach. (a) An input graph to FairSample. (b) The augmented graph after injecting an inter-group edge. (c) Jointly train the sampling policy $f_{S}$ and the 2-layer GCN node classifier $f_{G}$ with the computation graph of a node $v_1$.
Figure 3: The demographic parity and accuracy tradeoff of the models trained by FairSample and GSR. A point closer to the top-right corner is better.
Figure 4: The accuracy and $\Delta DP$ convergence curves of FairSample, FGAT, NIFTY, GSR, and PASSR across training epochs. For clarity of display, the results of PASSR and GSR are reported in panels separate from the other baselines.
Figure 5: The training time (in seconds) and GPU memory usage (in MB) of FairSample and the baselines on the PZG dataset. The bars in each figure are sorted in ascending order of value from left to right. The y-axis uses a logarithmic scale.
...and 1 more figure

Theorems & Definitions (7)

Example 1: GCN Classifier and Computation Graph
Definition 1: Demographic Parity [DBLP:conf/icml/AgarwalBD0W18]
Theorem 1
proof
Example 2: Phase 1
Example 3: Phase 2
Example 4
