Are Expressive Encoders Necessary for Discrete Graph Generation?

Jay Revolinsky, Harry Shomer, Jiliang Tang

TL;DR

A systematic ablation study shows the benefit provided by each GenGNN component and indicates that residual connections are needed to mitigate oversmoothing on complicated graph structures. The paper also investigates learned diffusion representations to uncover whether GNNs can be expressive neural backbones for discrete diffusion.

Abstract

Discrete graph generation has emerged as a powerful paradigm for modeling graph data, often relying on highly expressive neural backbones such as transformers or higher-order architectures. We revisit this design choice by introducing GenGNN, a modular message-passing framework for graph generation. Diffusion models with GenGNN achieve more than 90% validity on the Tree and Planar datasets, within margins of graph transformers, at 2-5x faster inference speed. For molecule generation, DiGress with a GenGNN backbone achieves 99.49% validity. A systematic ablation study shows the benefit provided by each GenGNN component and indicates that residual connections are needed to mitigate oversmoothing on complicated graph structures. Through scaling analyses, we apply a principled metric-space view to investigate learned diffusion representations and uncover whether GNNs can be expressive neural backbones for discrete diffusion.
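
The oversmoothing effect mentioned in the abstract can be illustrated with a toy example. The sketch below is not the paper's GenGNN or its dispersion measure: it assumes plain mean aggregation on a 5-node path graph with self-loops, a hypothetical `alpha` that mixes the input features back in at every step (an initial-residual connection), and a generic `dispersion` defined as the Frobenius norm of the features after subtracting the per-column mean over nodes.

```python
import numpy as np

def dispersion(H):
    # Generic spread of node features around their mean over nodes
    # (an illustrative choice, not the paper's definition).
    return float(np.linalg.norm(H - H.mean(axis=0, keepdims=True)))

def propagate(H0, M, steps, alpha=0.0):
    # Repeated mean aggregation; alpha > 0 re-injects the input features
    # at every step (a simple residual anchor, assumed for illustration).
    H = H0.copy()
    for _ in range(steps):
        H = alpha * H0 + (1.0 - alpha) * (M @ H)
    return H

n = 5
# Path graph on 5 nodes with self-loops.
A = np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
# Row-normalised adjacency, i.e. mean aggregation over neighbours.
M = A / A.sum(axis=1, keepdims=True)
# One-hot input features, so every node starts out distinguishable.
H0 = np.eye(n)

plain = dispersion(propagate(H0, M, steps=200))
anchored = dispersion(propagate(H0, M, steps=200, alpha=0.5))
print(f"without residual: {plain:.2e}, with residual: {anchored:.2e}")
```

Without the residual term every node converges to the same feature vector, so the dispersion shrinks towards zero as depth grows; with the anchor the iteration settles at a fixed point that still separates the nodes.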

Paper Structure (34 sections, 9 theorems, 35 equations, 15 figures, 10 tables)

Key Result

theorem 1

Given Eqn. eq:unit_distance and Assumptions ass:rrwp_nondeg1--ass:backbone_bound1, for all diffusion steps $t$ and the dominant eigenvector $v$, where $\mu_{v}$ is the margin estimating node-signal collapse and $X_{\mathrm{out}}^{(t)}$ is the node-wise denoised output, then: In particular, if $\gamma>2C$ then the denoiser outputs cannot collapse to $\mathrm{span}\{v\}$ at any reverse diffusion st

Figures (15)

  • Figure 1: Planar graphs generated via DeFoG with a simple GNN and Graph Transformer backbone. We can see that the GNN fails to properly sample planar structure, instead producing several clustered communities.
  • Figure 2: The per-layer GenGNN framework, composed of modular components, in order: Node (X), Edge (y) Features w/ RRWP (shown in orange), Edge Gating (EG), GNN/GINe/GCN layer, Node Gating (NG), Feed-Forward Networks (FFN), Residuals+Normalization (RN). Note: The blue and orange (RRWP) modules are ablatable. Yellow and green modules are always enabled during experimentation.
  • Figure 3: The top-5 relative inference speedups for GenGNN vs. PPGN and GT denoising backbones across permutations of tested datasets. Individual colors represent GT (blue) and PPGN (white), hatching corresponds to a given dataset.
  • Figure 4: The change in MMD and V.U.N. across individual ablated components of the GenGNN framework (log-scaled), with a simple GNN backbone (on right).
  • Figure 5: The change in Validity (top-left), Accuracy (bottom-left), MagDiff (top-center), and Accuracy (bottom-center) from layer depths 1 to 24 for the fully-enabled GenGNN and GT frameworks vs. GenGNN with RRWP and residual-normalization ablated, averaged over five runs. (top-right) The trade-off between MagDiff and Validity for the QM9 Dataset. (bottom-right) The trade-off between Average MMD Ratio and Accuracy for the Tree Dataset.
  • ...and 10 more figures
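
The component ordering listed in the Figure 2 caption can be sketched in NumPy as follows. This is a hedged illustration of the ordering only, not the authors' implementation: the function names (`rrwp`, `init_params`, `gengnn_like_layer`), the parameter names, the sigmoid gates, the sum aggregation, and the dense `(n, n, d)` edge tensor are all assumptions. `rrwp` follows the common definition of relative random walk probabilities as stacked powers of the row-normalised adjacency matrix.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer_norm(h, eps=1e-5):
    # Normalise each node's feature vector to zero mean and unit variance.
    mu = h.mean(axis=-1, keepdims=True)
    var = h.var(axis=-1, keepdims=True)
    return (h - mu) / np.sqrt(var + eps)

def rrwp(A, K):
    # Stack [I, M, M^2, ..., M^(K-1)] with M = D^-1 A as K edge channels.
    deg = A.sum(axis=1, keepdims=True)
    M = A / np.maximum(deg, 1.0)
    powers, P = [], np.eye(A.shape[0])
    for _ in range(K):
        powers.append(P)
        P = P @ M
    return np.stack(powers, axis=-1)  # shape (n, n, K)

def init_params(d, rng):
    # Hypothetical weight names, one (d, d) matrix per component.
    names = ["W_eg", "W_msg", "W_ng", "W_ff1", "W_ff2"]
    return {k: rng.normal(0.0, 1.0 / np.sqrt(d), (d, d)) for k in names}

def gengnn_like_layer(X, E, A, p, use_residual=True):
    # X: (n, d) node features, E: (n, n, d) edge features, A: (n, n) 0/1 mask.
    gate_e = sigmoid(E @ p["W_eg"])                  # edge gating (EG)
    msgs = gate_e * (X @ p["W_msg"])[None, :, :]     # msgs[i, j] comes from node j
    H = (A[:, :, None] * msgs).sum(axis=1)           # sum over neighbours (GNN layer)
    H = sigmoid(X @ p["W_ng"]) * H                   # node gating (NG)
    H = np.maximum(H @ p["W_ff1"], 0.0) @ p["W_ff2"] # feed-forward network (FFN)
    if use_residual:                                 # residual, switchable for ablation
        H = X + H
    return layer_norm(H)                             # normalisation

# Usage on a 4-node cycle graph with random node features.
rng = np.random.default_rng(0)
n, d = 4, 8
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
X = rng.normal(size=(n, d))
E = rrwp(A, d)
params = init_params(d, rng)
out = gengnn_like_layer(X, E, A, params)
```

The `use_residual` flag mirrors the idea that some modules can be switched off for ablation. Which modules the paper treats as ablatable, and how its gates and GNN layer are parameterised, should be taken from the paper itself.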

Theorems & Definitions (11)

  • definition 1: Node-wise Structural Dispersion
  • theorem 1: Uniform non-collapse of residual-anchored graph denoisers
  • corollary 1: Anchored denoising robustness
  • lemma 1: Equivalence of oversmoothing measures scholkemper2024residual
  • definition 2: Node-wise Structural Dispersion
  • lemma 2: Two-term lower bound in an orthogonal subspace
  • lemma 3: Positional Encoding anchor lower bounds residual
  • lemma 4: Outer Residual induces a deterministic non-collapse certificate
  • proposition 1: Per-step non-collapse bound for the denoiser
  • theorem 2: Uniform non-collapse of residual-anchored graph denoisers
  • ...and 1 more