Backdoor Attacks on Discrete Graph Diffusion Models

Jiawen Wang, Samin Karim, Yuan Hong, Binghui Wang

TL;DR

We study the security risk of discrete graph diffusion models (DGDMs) by proposing the first backdoor attack against them. The attack combines a subgraph trigger with a carefully designed forward diffusion, so that backdoored graphs converge to a distinct backdoored limit distribution $({\bm{m}}_{X_B}, {\bm{m}}_{E_B})$ while clean graph quality is preserved. The backdoored DiGress is shown to be permutation invariant and to generate exchangeable graphs, which keeps the attack stealthy under node reorderings. Empirical results on QM9, MOSES, and GuacaMol demonstrate high attack success rates with minimal impact on validity and uniqueness, and transferability to DisCo confirms that the vulnerability carries over to other DGDMs. The work highlights the need for defenses and motivates future work on provable guarantees and robust backdoor mitigation for graph diffusion-based generation.

Abstract

Diffusion models are powerful generative models in continuous data domains such as image and video data. Discrete graph diffusion models (DGDMs) have recently extended them to graph generation, which is crucial in fields like molecule and protein modeling, and have achieved state-of-the-art (SOTA) performance. However, it is risky to deploy DGDMs in safety-critical applications (e.g., drug discovery) without understanding their security vulnerabilities. In this work, we perform the first study of backdoor attacks on graph diffusion models. A backdoor attack is a severe attack that manipulates both the training and the inference/generation phases of a graph diffusion model. We first define the threat model, under which we design the attack such that the backdoored graph diffusion model can generate 1) high-quality graphs without backdoor activation, 2) effective, stealthy, and persistent backdoored graphs with backdoor activation, and 3) graphs that are permutation invariant and exchangeable, two core properties of graph generative models. Properties 1) and 2) are validated via empirical evaluations without and with backdoor defenses, while 3) is validated via theoretical results.

Paper Structure

This paper contains 26 sections, 3 theorems, 23 equations, 13 figures, 8 tables, 2 algorithms.

Key Result

Theorem 1

(Backdoored DiGress is Permutation Invariant) Let $G^t = ({\bm{X}}^t, {\bm{E}}^t)$ be an intermediate noised (clean or backdoored) graph, and $\pi(G^t) = (\pi({\bm{X}}^t), \pi({\bm{E}}^t))$ be its permutation. Backdoored DiGress is permutation invariant, i.e., $p_{\theta_B}(\pi(G^t)) = \pi (p_{\thet
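To make the property in Theorem 1 concrete, the following is a minimal sketch that reads it as "relabeling the nodes of the input graph relabels the output in the same way". The function `toy_denoiser` is a hypothetical one-step message-passing map, not the DiGress network or the paper's $p_{\theta_B}$; it only illustrates how such a property can be checked numerically on a small graph.

```python
def toy_denoiser(x, e):
    """Hypothetical stand-in for a denoiser: h_i = x_i + sum_j e[i][j] * x_j."""
    n, a = len(x), len(x[0])
    return [
        [x[i][c] + sum(e[i][j] * x[j][c] for j in range(n)) for c in range(a)]
        for i in range(n)
    ]


def permute_graph(x, e, perm):
    """Relabel nodes so that new node i is old node perm[i]."""
    x_p = [x[p] for p in perm]
    e_p = [[e[p][q] for q in perm] for p in perm]
    return x_p, e_p


# Small example graph: 3 nodes, 2 node types (one-hot), integer edge types.
x = [[1, 0], [0, 1], [1, 0]]
e = [[0, 1, 0], [1, 0, 2], [0, 2, 0]]
perm = [2, 0, 1]

# Path A: run the map on the original graph, then permute the output rows.
out = toy_denoiser(x, e)
out_then_perm = [out[p] for p in perm]

# Path B: permute the graph first, then run the map.
x_p, e_p = permute_graph(x, e, perm)
perm_then_out = toy_denoiser(x_p, e_p)

print(out_then_perm == perm_then_out)
```

Both paths agree here because the toy map aggregates over neighbors with a sum, which does not depend on node order. Whether a real trained model satisfies the same check depends on its architecture and loss, which is what the theorem addresses.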

Figures (13)

  • Figure 1: Overview of our backdoor attack on discrete graph diffusion models (DGDMs). Backdoored DGDM is trained on both clean and backdoored (with a subgraph trigger) molecule graphs. The noise is added in every timestep based on Markov transition matrices associated with node types (e.g., C, N, F, O) and edge types (e.g., 'NoBond':$\emptyset$, 'SINGLE Bond':$-$, 'DOUBLE Bond':$=$, 'TRIPLE Bond':$\equiv$). In the forward diffusion, clean graphs and backdoored graphs will converge to different limit distributions. In the reverse denoising diffusion, a clean / backdoored graph is generated and denoised step by step starting from the limit distribution produced by clean / backdoored graphs.
  • Figure 2: QM9-clean
  • Figure 3: Moses-clean
  • Figure 4: Guacamol-clean
  • Figure 5: QM9-backdoor
  • ...and 8 more figures
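The forward diffusion described in the Figure 1 caption can be sketched as follows. This is a minimal illustration, assuming a marginal-style cumulative transition matrix $\bar{Q}^t = \bar{\alpha}^t I + (1 - \bar{\alpha}^t)\,\mathbf{1}{\bm{m}}^\top$ for node types; the vectors `m_clean` and `m_backdoor` are hypothetical numbers chosen for the example, not values from the paper, and edge types would be handled analogously.

```python
def cumulative_transition(alpha_bar, m):
    """Assumed form: Q_bar = alpha_bar * I + (1 - alpha_bar) * 1 m^T."""
    k = len(m)
    return [
        [alpha_bar * (1.0 if i == j else 0.0) + (1.0 - alpha_bar) * m[j]
         for j in range(k)]
        for i in range(k)
    ]


def noised_probs(one_hot_rows, alpha_bar, m):
    """Per-node categorical distribution after noising: X @ Q_bar."""
    q = cumulative_transition(alpha_bar, m)
    k = len(m)
    return [
        [sum(row[i] * q[i][j] for i in range(k)) for j in range(k)]
        for row in one_hot_rows
    ]


# Hypothetical limit distributions over four node types (C, N, O, F).
m_clean = [0.7, 0.1, 0.15, 0.05]
m_backdoor = [0.05, 0.15, 0.1, 0.7]

# Two nodes: a C atom and an N atom, one-hot encoded.
nodes = [[1, 0, 0, 0], [0, 1, 0, 0]]

# alpha_bar = 1 means no noise; alpha_bar = 0 is the end of the forward process.
unchanged = noised_probs(nodes, 1.0, m_clean)
clean_limit = noised_probs(nodes, 0.0, m_clean)
backdoor_limit = noised_probs(nodes, 0.0, m_backdoor)

print(clean_limit[0], backdoor_limit[0])
```

Under this assumed form, every node's distribution collapses to the chosen vector `m` once `alpha_bar` reaches zero, regardless of its starting type. Using a different vector for triggered graphs is what gives the clean and backdoored processes different limit distributions to start reverse denoising from.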

Theorems & Definitions (5)

  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Proposition 1: xu2022geodiff