Towards Fast Graph Generation via Autoregressive Noisy Filtration Modeling

Markus Krimmel, Jenna Wiens, Karsten Borgwardt, Dexiong Chen

TL;DR

ANFM addresses the trade-off between generation speed and expressiveness in graph generation by introducing Autoregressive Noisy Filtration Modeling, which generates graphs by reversing noise-augmented filtration sequences. It combines a structure-aware GNN with a transformer-based temporal mixer and uses a mixture of Bernoulli edge decoders to allow both edge additions and deletions, yielding short sequences whose length $T$ is typically small (e.g., $T\leq 32$). Training proceeds in two stages: teacher-forcing on noisy filtrations, followed by adversarial fine-tuning via PPO against a GraphGPS discriminator, which addresses exposure bias and improves sample quality. Empirically, ANFM achieves up to a 100x speedup over diffusion models while maintaining competitive accuracy across synthetic and real-world datasets, with ablations confirming the importance of noise augmentation, RL tuning, and filtration granularity. The work advances high-throughput graph generation and opens avenues for attributed graphs and more robust benchmarking.
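To make the "mixture of Bernoulli" edge decoder more concrete, the following is a minimal sketch of sampling from a generic mixture of Bernoulli distributions over candidate edges. It is not the paper's parameterization: the function name `sample_edges` and the arguments `mix_weights` and `edge_probs` are hypothetical, and in a real model both would be produced by the network rather than passed in by hand. The point it illustrates is that every candidate edge is resampled at each step, so an edge present in the previous graph can be dropped and an absent one can be added.

```python
import random
from typing import List, Sequence


def sample_edges(
    mix_weights: Sequence[float],
    edge_probs: Sequence[Sequence[float]],
    rng: random.Random,
) -> List[int]:
    """Sample one 0/1 indicator per candidate edge from a mixture of Bernoullis.

    mix_weights[k]   -- weight of mixture component k (hypothetical input)
    edge_probs[k][e] -- probability that edge e is present under component k
    """
    # Step 1: pick a single mixture component for the whole graph.
    k = rng.choices(range(len(mix_weights)), weights=mix_weights, k=1)[0]
    # Step 2: given the component, sample every edge independently.
    return [1 if rng.random() < p else 0 for p in edge_probs[k]]
```

Sharing one component across all edges is what lets a mixture capture correlations between edges that a single product of independent Bernoullis cannot.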

Abstract

Graph generative models often face a critical trade-off between learning complex distributions and achieving fast generation speed. We introduce Autoregressive Noisy Filtration Modeling (ANFM), a novel approach that addresses both challenges. ANFM leverages filtration, a concept from topological data analysis, to transform graphs into short sequences of monotonically increasing subgraphs. This formulation extends the sequence families used in previous autoregressive models. To learn from these sequences, we propose a novel autoregressive graph mixer model. Our experiments suggest that exposure bias might represent a substantial hurdle in autoregressive graph generation and we introduce two mitigation strategies to address it: noise augmentation and a reinforcement learning approach. Incorporating these techniques leads to substantial performance gains, making ANFM competitive with state-of-the-art diffusion models across diverse synthetic and real-world datasets. Notably, ANFM produces remarkably short sequences, achieving a 100-fold speedup in generation time compared to diffusion models. This work marks a significant step toward high-throughput graph generation.
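The idea of turning a graph into a short sequence of monotonically increasing subgraphs can be sketched as follows. This is an illustrative construction under stated assumptions, not the authors' implementation: the edge weights stand in for whatever filtration function is used, and the schedule (revealing an equal share of the edges per step) and the names `filtration`, `edge_weights`, and `num_steps` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Edge = Tuple[int, int]


def filtration(edge_weights: Dict[Edge, float], num_steps: int) -> List[FrozenSet[Edge]]:
    """Return nested edge sets G_0, ..., G_T, where G_T is the full graph.

    Edges are revealed in order of increasing weight, so each set
    contains the previous one (assumed schedule: equal-sized chunks).
    """
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    ordered = sorted(edge_weights, key=edge_weights.get)
    m = len(ordered)
    # Step t keeps the first floor(m * t / T) edges of the ordering.
    return [frozenset(ordered[: (m * t) // num_steps]) for t in range(num_steps + 1)]


# A 4-cycle with made-up edge weights, split into T = 2 steps.
weights = {(0, 1): 0.1, (1, 2): 0.4, (2, 3): 0.2, (0, 3): 0.9}
steps = filtration(weights, num_steps=2)
print([len(g) for g in steps])  # [0, 2, 4]
```

Reading the returned list from first to last is the direction a generator would follow, since each step adds edges; reading it in reverse corresponds to the edge-deletion view of the same sequence.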

Paper Structure

This paper contains 99 sections, 5 theorems, 39 equations, 6 figures, 27 tables, 1 algorithm.

Key Result

Proposition 3.1

The asymptotic runtime complexity for sampling a graph with $N$ nodes from an ANFM with $T$ timesteps is:

Figures (6)

  • Figure 1: Top: A graph is transformed into a sequence of subgraphs (filtration) via edge-deletion. Bottom left: the generator is trained via teacher-forcing to reverse the filtration process. Bottom right: the generator is fine-tuned in free-running mode via reinforcement learning on a reward signal output by a discriminator in a SeqGAN-like framework (cf. the appendix on adversarial fine-tuning).
  • Figure 2: Performance of ANFM and DiGress on the expanded planar dataset as number of generation steps varies.
  • Figure 3: Uncurated samples from ANFM model trained on GuacaMol.
  • Figure 4: Uncurated samples from ANFM (line Fiedler variant).
  • Figure 5: SBM validity during training of ESGG on expanded SBM dataset. Validity is computed using 1000 refinement steps in validation but 100 refinement steps during testing to remain consistent with other baselines.
  • ...and 1 more figure

Theorems & Definitions (12)

  • Proposition 3.1
  • proof
  • Proposition 3.2
  • proof
  • Proposition 3.3
  • proof
  • Proposition 3.4
  • proof
  • Definition 10.1
  • Proposition 10.2
  • ...and 2 more