FGAD: Self-boosted Knowledge Distillation for An Effective Federated Graph Anomaly Detection Framework

Jinyu Cai, Yunhe Zhang, Zhoumin Lu, Wenzhong Guo, See-kiong Ng

TL;DR

This work introduces an anomaly generator that perturbs normal graphs into anomalous ones, trains a powerful anomaly detector to distinguish the generated anomalous graphs from normal ones, and designs an effective collaborative learning mechanism that preserves the personalization of local models and significantly reduces communication costs among clients.

Abstract

Graph anomaly detection (GAD) aims to identify anomalous graphs that deviate significantly from the others, and it has attracted growing attention because graph-structured data is both widespread and complex in many real-world scenarios. However, existing GAD methods are usually trained centrally, which risks privacy leakage in sensitive cases and thereby impedes collaboration among organizations seeking to collectively develop robust GAD models. Although federated learning offers a promising solution, the prevalent non-IID problems and high communication costs present significant challenges, which are particularly pronounced when graph data is distributed among different participants. To tackle these challenges, we propose an effective federated graph anomaly detection framework (FGAD). We first introduce an anomaly generator that perturbs normal graphs into anomalous ones, and train a powerful anomaly detector to distinguish the generated anomalous graphs from normal ones. Then, we use a student model to distill knowledge from the trained anomaly detector (the teacher model), which aims to preserve the personalization of local models and alleviate the adverse impact of non-IID problems. Moreover, we design an effective collaborative learning mechanism that preserves the personalization of local models and significantly reduces communication costs among clients. Empirical results on GAD tasks with non-IID graphs, compared against state-of-the-art baselines, demonstrate the superiority and efficiency of the proposed FGAD method.
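To make the communication argument concrete, the following is a minimal sketch of one collaborative round, not the authors' implementation. It assumes that only the student parameters leave each client and that the server combines them with a FedAvg-style average weighted by local dataset size; the abstract does not specify either detail. The names `Client`, `aggregate_students`, and `communication_round` are hypothetical, and models are reduced to flat lists of floats.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Client:
    """One participant. Models are flattened to plain parameter lists."""
    num_graphs: int        # size of the local graph dataset
    teacher: List[float]   # anomaly detector; assumed to stay on the client
    student: List[float]   # distilled model; assumed to be the only part shared


def aggregate_students(clients: List[Client]) -> List[float]:
    """Average student parameters, weighted by local dataset size (assumption)."""
    total = sum(c.num_graphs for c in clients)
    dim = len(clients[0].student)
    return [
        sum(c.student[i] * c.num_graphs for c in clients) / total
        for i in range(dim)
    ]


def communication_round(clients: List[Client]) -> int:
    """Run one round and return how many parameters were uploaded in total."""
    uploaded = sum(len(c.student) for c in clients)
    global_student = aggregate_students(clients)
    for c in clients:
        # Each client receives the shared student; its teacher is never touched.
        c.student = list(global_student)
    return uploaded
```

Under these assumptions, the upload cost per round scales with the size of the student alone, and each teacher keeps whatever it learned from its own local (possibly non-IID) graphs.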

Paper Structure

This paper contains 32 sections, 15 equations, 8 figures, 5 tables, and 1 algorithm.

Figures (8)

  • Figure 1: Overview of the centralized learning and federated learning frameworks.
  • Figure 2: Overview of the FGAD framework. Note that the teacher model uses both normal and generated anomalous graphs to train an anomaly detector, while the student model takes only normal graphs as input to distill normal patterns.
  • Figure 3: Embedding visualization of the proposed FGAD compared with several baselines using t-SNE. Note that the data points marked in yellow, red, and green correspond to normal graphs (test), anomalous graphs, and normal graphs (train), respectively.
  • Figure 4: Parameter analysis of $\lambda$ and $\gamma$ on IMDB-BINARY and MOLECULES. Note that the values of $\lambda$ and $\gamma$ range from $10^{-4}$ to $10^{3}$.
  • Figure 5: Average performance and distribution of variance between clients for FedAvg and FGAD. Note that the number of clients ranges from 2 to 10.
  • ...and 3 more figures