Backdoor Attacks on Discrete Graph Diffusion Models

Jiawen Wang; Samin Karim; Yuan Hong; Binghui Wang

TL;DR

We address the security risk of discrete graph diffusion models (DGDMs) by proposing the first backdoor attack against them, leveraging a subgraph trigger and a carefully designed forward diffusion to create a distinct backdoored limit distribution $({\bm{m}}_{X_B}, {\bm{m}}_{E_B})$ while preserving clean graph quality. The backdoored DiGress is shown to be permutation invariant and to generate exchangeable graphs, enabling robust stealth across node reorderings. Empirical results on QM9, MOSES, and GuacaMol demonstrate high attack success rates with minimal impact on validity/uniqueness, and transferability to DisCo confirms cross-DGDM vulnerability. The work highlights the need for defenses and motivates future directions toward provable guarantees and robust backdoor mitigation for graph diffusion-based generation.
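As a rough illustration of what a "limit distribution" means in this setting, the sketch below assumes a DiGress-style marginal transition, whose cumulative form is $\bar{\alpha}^t {\bm{I}} + (1-\bar{\alpha}^t)\,\mathbf{1}{\bm{m}}^\top$. It shows that the noised distribution of a node type approaches ${\bm{m}}$ as $\bar{\alpha}^t \to 0$. The function names, the three-category example, and the two marginal vectors are hypothetical. This is not the paper's backdoored forward process, only a toy showing how two different marginals give two distinguishable limits.

```python
def marginal_transition(alpha_bar, m):
    """Cumulative transition matrix alpha_bar * I + (1 - alpha_bar) * 1 m^T.

    Assumed DiGress-style form; every row is a probability distribution.
    """
    k = len(m)
    return [
        [alpha_bar * (1.0 if i == j else 0.0) + (1.0 - alpha_bar) * m[j]
         for j in range(k)]
        for i in range(k)
    ]


def noised_distribution(x0, alpha_bar, m):
    """Distribution of the noised category given a (one-hot) starting row x0."""
    q = marginal_transition(alpha_bar, m)
    k = len(m)
    return [sum(x0[i] * q[i][j] for i in range(k)) for j in range(k)]


# Hypothetical marginals over three node types (illustrative values only).
m_clean = [0.7, 0.2, 0.1]
m_backdoor = [0.1, 0.1, 0.8]
x0 = [1.0, 0.0, 0.0]  # a node that starts as type 0

# With alpha_bar close to 0 (late diffusion steps), x0 is almost forgotten.
near_limit_clean = noised_distribution(x0, 1e-6, m_clean)
near_limit_backdoor = noised_distribution(x0, 1e-6, m_backdoor)
```

Under these assumptions the same starting node ends near `m_clean` in one process and near `m_backdoor` in the other, which is the sense in which a separate limit distribution could mark the backdoored generation path.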

Abstract

Diffusion models are powerful generative models for continuous data domains such as images and video. Discrete graph diffusion models (DGDMs) have recently extended them to graph generation, which is crucial in fields like molecule and protein modeling, and have achieved state-of-the-art performance. However, it is risky to deploy DGDMs in safety-critical applications (e.g., drug discovery) without understanding their security vulnerabilities. In this work, we present the first study of backdoor attacks on graph diffusion models. Such attacks are severe because they manipulate both the training and the inference/generation phases of the model. We first define the threat model, under which we design the attack such that the backdoored graph diffusion model generates 1) high-quality graphs when the backdoor is not activated, 2) effective, stealthy, and persistent backdoored graphs when the backdoor is activated, and 3) graphs that are permutation invariant and exchangeable--two core properties of graph generative models. Properties 1) and 2) are validated through empirical evaluations both without and with backdoor defenses, while property 3) is established theoretically.
