GraphRCG: Self-Conditioned Graph Generation
Song Wang, Zhen Tan, Xinyu Zhao, Tianlong Chen, Huan Liu, Jundong Li
TL;DR
GraphRCG tackles the problem of generating graphs that faithfully reflect the training distribution by modeling that distribution explicitly: each graph is mapped to a low-dimensional representation, and a Representation Diffusion Model learns to generate new representations. The bootstrapped, denoised representations then guide graph generation step by step, in a self-conditioned manner, through a graph transformer, yielding high-fidelity graphs in both generic and molecular domains. The framework outperforms state-of-the-art baselines on structural distribution metrics and molecular validity, and interpolating between representations reveals a coherent distribution space. The self-conditioned approach gives graph generation a principled conditioning mechanism that connects diffusion and attention-based architectures, and may support robust conditional graph generation in chemistry and networked systems.
Abstract
Graph generation generally aims to create new graphs that closely align with a specific graph distribution. Existing works often capture this distribution only implicitly, through the optimization of generators, and may therefore overlook the intricacies of the distribution itself. Furthermore, these approaches generally neglect the insights that the learned distribution offers for graph generation. In contrast, we propose a novel self-conditioned graph generation framework designed to explicitly model graph distributions and employ these distributions to guide the generation process. We first perform self-conditioned modeling to capture the graph distributions by transforming each graph sample into a low-dimensional representation and optimizing a representation generator to create new representations reflective of the learned distribution. Subsequently, we leverage these bootstrapped representations as self-conditioned guidance for the generation process, thereby facilitating the generation of graphs that more accurately reflect the learned distributions. We conduct extensive experiments on generic and molecular graph datasets across various fields. Our framework demonstrates superior performance over existing state-of-the-art graph generation methods in terms of graph quality and fidelity to training data.
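The two-stage idea in the abstract (encode graphs into representations, fit a generator over those representations, then use sampled representations to guide graph generation step by step) can be illustrated with a deliberately tiny toy sketch. Everything in it is an assumption made for illustration and is not the paper's method: the "encoder" is just edge density, the representation generator is a Gaussian fit standing in for a learned generative model, and the guided generation loop resamples edges instead of running a graph transformer. The function names `encode`, `fit_representation_generator`, `sample_representation`, and `generate_graph` are hypothetical.

```python
import random
import statistics


def random_graph(n, p, rng):
    """Symmetric 0/1 adjacency matrix; each edge is present with probability p."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            adj[i][j] = adj[j][i] = int(rng.random() < p)
    return adj


def encode(adj):
    """Hypothetical encoder: graph -> 1-d representation (edge density)."""
    n = len(adj)
    edges = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    return edges / (n * (n - 1) / 2)


def fit_representation_generator(reps):
    """Swapped-in generator: a Gaussian fit over representations (assumption)."""
    return statistics.mean(reps), statistics.pstdev(reps)


def sample_representation(gen, rng):
    """Draw a new representation and clip it to the valid density range [0, 1]."""
    mu, sigma = gen
    return min(1.0, max(0.0, rng.gauss(mu, sigma)))


def generate_graph(rep, n, steps, rng):
    """Start from noise, then at every step resample about half of the edge
    slots with the sampled representation as the guiding edge probability."""
    adj = random_graph(n, 0.5, rng)  # pure-noise starting point
    for _ in range(steps):
        for i in range(n):
            for j in range(i + 1, n):
                if rng.random() < 0.5:
                    adj[i][j] = adj[j][i] = int(rng.random() < rep)
    return adj


rng = random.Random(0)
train_graphs = [random_graph(20, 0.2, rng) for _ in range(50)]

# Stage 1: model the distribution in representation space.
reps = [encode(g) for g in train_graphs]
generator = fit_representation_generator(reps)

# Stage 2: sample a representation and use it to guide generation.
rep = sample_representation(generator, rng)
new_graph = generate_graph(rep, n=20, steps=10, rng=rng)
print(f"sampled representation: {rep:.3f}, "
      f"density of generated graph: {encode(new_graph):.3f}")
```

The point of the sketch is only the control flow: the generator never sees the training graphs directly in stage 2, only a representation sampled from the fitted distribution, which is the sense in which the guidance is "self-conditioned" here.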
