Towards Fair Graph Anomaly Detection: Problem, Benchmark Datasets, and Evaluation

Neng Kai Nigel Neo; Yeon-Chang Lee; Yiqiao Jin; Sang-Wook Kim; Srijan Kumar

TL;DR

The paper defines FairGAD as the task of detecting anomalous nodes in a graph without discriminating against sensitive groups. It formalizes the task with anomaly labels $Y\in\{0,1\}^n$ and sensitive attributes $S\in\{0,1\}^n$, and measures fairness with $SP$ and $EOO$. It contributes two real-world datasets, from Reddit and Twitter, with 1.2M and 0.4M edges and 9k and 47k nodes, respectively; misinformation spreaders serve as the anomaly labels and political leaning as the sensitive attribute. An extensive empirical study compares nine GAD and non-graph AD methods, plus five fairness techniques, and finds that current methods do not consistently improve the accuracy–fairness trade-off and can even increase unfairness under certain settings. The work releases datasets and code to spur FairGAD research, highlights practical and ethical considerations, and calls for new methods that better balance detection performance with fairness in real-world social graphs.
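The two fairness metrics named above can be illustrated with a short sketch. This summary does not spell out either definition, so the code assumes that $SP$ is the absolute gap in predicted-anomaly rates between the two sensitive groups and that $EOO$ is the same gap restricted to true anomalies ($Y=1$). The function names and the toy arrays are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def sp_gap(y_pred, s):
    """Assumed SP: |P(y_pred=1 | s=0) - P(y_pred=1 | s=1)|."""
    y_pred = np.asarray(y_pred, dtype=bool)
    s = np.asarray(s, dtype=bool)
    # Mean of a boolean array is the fraction of nodes flagged as anomalous.
    return float(abs(y_pred[~s].mean() - y_pred[s].mean()))


def eoo_gap(y_pred, y_true, s):
    """Assumed EOO: the SP-style gap computed only over true anomalies."""
    y_pred = np.asarray(y_pred, dtype=bool)
    y_true = np.asarray(y_true, dtype=bool)
    s = np.asarray(s, dtype=bool)
    # True-positive rate within each sensitive group.
    tpr_group0 = y_pred[~s & y_true].mean()
    tpr_group1 = y_pred[s & y_true].mean()
    return float(abs(tpr_group0 - tpr_group1))


# Toy example with 8 nodes: 4 per sensitive group, 2 true anomalies per group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
s      = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]

print("SP gap:", sp_gap(y_pred, s))
print("EOO gap:", eoo_gap(y_pred, y_true, s))
```

Under these assumed definitions, a value of 0 means both groups are flagged at the same rate, and larger values indicate a larger disparity. Both functions expect every group (and, for `eoo_gap`, every group's set of true anomalies) to be non-empty.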

Abstract

The Fair Graph Anomaly Detection (FairGAD) problem aims to accurately detect anomalous nodes in an input graph while avoiding biased predictions against individuals from sensitive subgroups. However, the current literature does not comprehensively discuss this problem, nor does it provide realistic datasets that encompass actual graph structures, anomaly labels, and sensitive attributes. To bridge this gap, we introduce a formal definition of the FairGAD problem and present two novel datasets constructed from the social media platforms Reddit and Twitter. These datasets comprise 1.2 million and 400,000 edges associated with 9,000 and 47,000 nodes, respectively, and leverage political leanings as sensitive attributes and misinformation spreaders as anomaly labels. We demonstrate that our FairGAD datasets significantly differ from the synthetic datasets used by the research community. Using our datasets, we investigate the performance-fairness trade-off of nine existing GAD and non-graph AD methods when combined with five state-of-the-art fairness methods. Our code and datasets are available at https://github.com/nigelnnk/FairGAD

Paper Structure (16 sections, 7 equations, 3 figures, 5 tables)

Introduction
The Proposed Problem: FairGAD
Data Description
Collection Procedure
Dataset Statistics
Why Do Our Datasets Matter and Suit the FairGAD Problem?
Evaluation
Experimental Settings
Accuracy vs. Fairness
Sensitive Attribute Analysis
Ethics Statement
Conclusion
List of Politics Related Subreddits
Results on Additional Baselines
Further Details on Fairness Regularizers
...and 1 more section

Figures (3)

Figure 1: Changes in AUCROC and EOO for different values of $\lambda$ (HIN or FairOD factor) and $\gamma$ (ADCG factor) for CONAD method with HIN and FairOD regularizers on Reddit.
Figure 2: Changes in AUCROC and EOO for different values of $\lambda$ (Correlation factor) for CONAD, DOMINANT, and VGOD methods with Correlation regularizer on Reddit.
Figure 3: Trade-off spaces for GAD methods with fairness regularizers. The ideal FairGAD method should have low EOO and high AUCROC (i.e., the bottom right corner).
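The role of the weight $\lambda$ in the captions above can be sketched as follows. The captions do not define the Correlation regularizer, so this minimal example assumes it penalizes the absolute Pearson correlation between anomaly scores and the sensitive attribute, added to the detector's own loss with weight $\lambda$. The function names `correlation_penalty` and `regularized_loss`, and the toy values, are hypothetical.

```python
import numpy as np


def correlation_penalty(scores, s):
    """Absolute Pearson correlation between anomaly scores and s (assumed form)."""
    scores = np.asarray(scores, dtype=float)
    s = np.asarray(s, dtype=float)
    scores_centered = scores - scores.mean()
    s_centered = s - s.mean()
    denom = np.linalg.norm(scores_centered) * np.linalg.norm(s_centered)
    if denom == 0.0:
        # A constant vector has no defined correlation; treat it as no penalty.
        return 0.0
    return float(abs(scores_centered @ s_centered) / denom)


def regularized_loss(base_loss, scores, s, lam):
    """Detector loss plus lam times the fairness penalty (assumed combination)."""
    return base_loss + lam * correlation_penalty(scores, s)


s = [0, 0, 1, 1]
biased_scores = [0.1, 0.2, 0.8, 0.9]    # scores track the sensitive attribute
balanced_scores = [0.1, 0.9, 0.1, 0.9]  # scores are unrelated to it

for lam in (0.0, 0.5, 1.0):
    print(lam,
          regularized_loss(1.0, biased_scores, s, lam),
          regularized_loss(1.0, balanced_scores, s, lam))
```

With $\lambda = 0$ the loss reduces to the detector's own objective; increasing $\lambda$ raises the cost of scores that track the sensitive attribute, which is the kind of knob the figures sweep when plotting AUCROC against EOO.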
