Graph Neural Backdoor: Fundamentals, Methodologies, Applications, and Future Directions

Xiao Yang, Gaolei Li, Jianhua Li

TL;DR

This survey addresses the vulnerability of Graph Neural Networks to backdoor attacks, especially in settings where training is outsourced or models are sourced from untrusted providers. It provides a structured taxonomy of backdoor attacks and defenses in graph learning, detailing data-poisoning frameworks, trigger designs, and activation mechanisms, along with practical defenses such as detection, filtration, and mitigation. The paper also discusses application scenarios and challenges, including IP protection and machine unlearning verification, and outlines future directions ranging from semantic and black-box backdoors to generative, few-shot, and large-model extensions. Overall, the work aims to equip defenders with principled insights and to guide future research toward more secure, trustworthy graph-based AI systems.

Abstract

Graph Neural Networks (GNNs) have significantly advanced a range of downstream graph tasks, including recommender systems, molecular structure prediction, and social media analysis. Despite these gains, recent research has empirically demonstrated that GNNs are vulnerable to backdoor attacks, in which adversaries use triggers to poison input samples and induce the GNN to produce adversary-chosen malicious outputs. This vulnerability typically arises when the training process is controlled by another party or when untrusted models are deployed, for example when model training is delegated to a third-party service, external training sets are used, or pre-trained models are taken from online sources. Although research on GNN backdoors is growing, a comprehensive investigation of the field is lacking. To bridge this gap, we present the first survey dedicated to GNN backdoors. We begin by outlining the fundamental definition of GNNs, then summarize and categorize current GNN backdoor attacks and defenses according to their technical characteristics and application scenarios. We next analyze the applicability and use cases of GNN backdoors. Finally, we explore potential research directions. This survey aims to explain the principles of graph backdoors, provide insights to defenders, and promote future security research.

Paper Structure (33 sections, 8 equations, 3 figures, 3 tables)

Figures (3)

  • Figure 1: Illustration of a GNN backdoor attack. The adversary poisons the learner's training data by embedding a specially designed trigger, so that the model trained on this data is backdoored. The attack forces the backdoored model to predict the target result for poisoned inputs.
  • Figure 2: Illustration of the general process of a GNN backdoor, which is achieved by data poisoning. Specifically, the adversary selects a subset of samples from the training data, inserts designated subgraphs into them as triggers, and then changes the ground-truth labels of these samples to the target class. As a consequence, the trained model predicts the target class for input samples that contain these triggers.
  • Figure 3: Illustration of the general GNN framework. It updates node representations through information aggregation and uses the learned representations to solve downstream tasks on graph-structured data: graph classification, node classification, and link classification.
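
The data-poisoning process described in the Figure 2 caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any specific attack covered by the survey: the `Graph` container, the helper names `insert_trigger` and `poison_dataset`, the fixed complete-subgraph trigger, and the `poison_rate` parameter are all hypothetical choices made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Graph:
    """Hypothetical minimal graph sample: node count, undirected edges, class label."""
    num_nodes: int
    edges: set  # set of (u, v) tuples with u < v
    label: int


def add_edge(edges, u, v):
    # Store undirected edges in a canonical (small, large) order.
    if u != v:
        edges.add((min(u, v), max(u, v)))


def insert_trigger(graph, trigger_size, rng):
    """Return a copy of `graph` with a complete subgraph of `trigger_size`
    new nodes attached to one randomly chosen existing node.
    Assumes graph.num_nodes >= 1 and trigger_size >= 1."""
    edges = set(graph.edges)
    new_nodes = list(range(graph.num_nodes, graph.num_nodes + trigger_size))
    # Fully connect the trigger nodes (an assumed, illustrative trigger shape).
    for i, u in enumerate(new_nodes):
        for v in new_nodes[i + 1:]:
            add_edge(edges, u, v)
    # Attach the trigger to the original graph with a single edge.
    anchor = rng.randrange(graph.num_nodes)
    add_edge(edges, anchor, new_nodes[0])
    return Graph(graph.num_nodes + trigger_size, edges, graph.label)


def poison_dataset(dataset, poison_rate, target_class, trigger_size=3, seed=0):
    """Select a subset of samples, insert the trigger, and relabel them.
    Returns the poisoned dataset and the set of poisoned indices.
    Unselected samples are passed through unchanged (same objects)."""
    rng = random.Random(seed)
    num_poison = int(len(dataset) * poison_rate)
    chosen = set(rng.sample(range(len(dataset)), num_poison))
    poisoned = []
    for i, g in enumerate(dataset):
        if i in chosen:
            g = insert_trigger(g, trigger_size, rng)  # new object, original untouched
            g.label = target_class                    # flip label to the target class
        poisoned.append(g)
    return poisoned, chosen
```

A model trained on the returned dataset would see the trigger subgraph only alongside the target label, which is the association the backdoor relies on. Real attacks differ in how they choose the trigger and where they attach it; the fixed clique and random attachment point here are only the simplest stand-in.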