Towards Fast Graph Generation via Autoregressive Noisy Filtration Modeling
Markus Krimmel, Jenna Wiens, Karsten Borgwardt, Dexiong Chen
TL;DR
ANFM targets the trade-off between generation speed and expressiveness in graph generative models. It introduces Autoregressive Noisy Filtration Modeling, which generates graphs by reversing noise-augmented filtration sequences. The model combines a structure-aware GNN with a transformer-based temporal mixer and uses a mixture of Bernoulli edge decoders that allows both edge additions and deletions. The resulting sequences are short, with $T$ typically small (e.g., $T\leq 32$). Training proceeds in two stages: teacher forcing on noisy filtrations, followed by adversarial fine-tuning via PPO against a GraphGPS discriminator, which addresses exposure bias and improves sample quality. Empirically, ANFM achieves up to a 100x speedup over diffusion models while remaining competitive with them across synthetic and real-world datasets, and ablations confirm the importance of noise augmentation, RL tuning, and filtration granularity. The work advances high-throughput graph generation and opens avenues for attributed graphs and more robust benchmarking.
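To make the "mixture of Bernoulli edge decoders" concrete, here is a minimal sketch of sampling one edge set from such a mixture. It is an illustration under assumptions, not the paper's decoder: the function name `sample_edges_mixture_bernoulli`, the dictionary layout of `edge_probs`, and the toy probabilities are all hypothetical, and in a real model the mixture weights and edge probabilities would be predicted by a network rather than fixed by hand.

```python
import random


def sample_edges_mixture_bernoulli(mix_weights, edge_probs, rng):
    """Sample an edge set from a mixture of independent Bernoulli edge models.

    mix_weights: K non-negative mixture weights (hypothetical layout).
    edge_probs:  K dicts, each mapping a node pair (u, v) to an edge probability.
    rng:         a random.Random instance, passed in for reproducibility.
    """
    # Step 1: pick one mixture component according to the mixture weights.
    k = rng.choices(range(len(mix_weights)), weights=mix_weights, k=1)[0]
    # Step 2: given the component, every node pair is an independent Bernoulli draw.
    return {pair for pair, p in edge_probs[k].items() if rng.random() < p}


# Toy example with two components that each prefer a different edge.
toy_weights = [0.5, 0.5]
toy_probs = [
    {(0, 1): 1.0, (1, 2): 0.0},  # component 0 always emits only (0, 1)
    {(0, 1): 0.0, (1, 2): 1.0},  # component 1 always emits only (1, 2)
]
sampled = sample_edges_mixture_bernoulli(toy_weights, toy_probs, random.Random(0))
```

Edges are independent only once the component is fixed, so the mixture as a whole can express correlations between edges. In this sketch the sampled set is taken to be the complete edge set of the next graph, so compared with the previous graph, edges may appear or disappear; this is one reading of "edge additions and deletions".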
Abstract
Graph generative models often face a critical trade-off between learning complex distributions and achieving fast generation speed. We introduce Autoregressive Noisy Filtration Modeling (ANFM), a novel approach that addresses both challenges. ANFM leverages filtration, a concept from topological data analysis, to transform graphs into short sequences of monotonically increasing subgraphs. This formulation extends the sequence families used in previous autoregressive models. To learn from these sequences, we propose a novel autoregressive graph mixer model. Our experiments suggest that exposure bias might represent a substantial hurdle in autoregressive graph generation and we introduce two mitigation strategies to address it: noise augmentation and a reinforcement learning approach. Incorporating these techniques leads to substantial performance gains, making ANFM competitive with state-of-the-art diffusion models across diverse synthetic and real-world datasets. Notably, ANFM produces remarkably short sequences, achieving a 100-fold speedup in generation time compared to diffusion models. This work marks a significant step toward high-throughput graph generation.
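As an illustration of how a filtration turns a graph into a short sequence of nested subgraphs, the following sketch orders the edges by a weight function and reveals them in $T$ batches. The weight function, the equal-sized batching schedule, and the name `edge_filtration` are assumptions chosen for this example; the abstract does not specify which edge weights or thresholds ANFM uses, and the noise augmentation step is not shown.

```python
def edge_filtration(edges, weight, num_steps):
    """Build nested edge sets G_0 <= G_1 <= ... <= G_T from an edge ordering.

    edges:     list of (u, v) node pairs.
    weight:    callable mapping an edge to a number (hypothetical filtration function).
    num_steps: T, the number of steps after the empty graph G_0.
    """
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    ordered = sorted(edges, key=weight)  # edges with low weight enter first
    m = len(ordered)
    sequence = []
    for t in range(num_steps + 1):
        # Integer ceiling of t * m / T: step t keeps roughly a fraction t / T of the edges.
        cutoff = (t * m + num_steps - 1) // num_steps
        sequence.append(frozenset(ordered[:cutoff]))
    return sequence


# Toy graph: a 4-cycle with one chord, with made-up edge weights.
toy_edges = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]
toy_weight = {(0, 1): 0.1, (1, 2): 0.4, (2, 3): 0.2, (0, 3): 0.9, (0, 2): 0.5}
steps = edge_filtration(toy_edges, toy_weight.get, num_steps=2)
```

With `num_steps=2`, the sequence contains three graphs: the empty graph, an intermediate subgraph holding the three lowest-weight edges, and the full graph. A generative model trained on such sequences would learn to produce each graph from its predecessors, so the sequence length depends on $T$ rather than on the number of nodes or edges.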
