Generating Graphs via Spectral Diffusion

Giorgia Minello; Alessandro Bicciato; Luca Rossi; Andrea Torsello; Luca Cosmo

TL;DR

This work develops GGSD, a diffusion-based graph generator that represents graphs via the Laplacian eigenspectrum and learns to sample eigenpairs $(\boldsymbol{\Phi},\boldsymbol{\lambda})$ from which it reconstructs the graph Laplacian $\mathbf{L}=\boldsymbol{\Phi}\boldsymbol{\Lambda}\boldsymbol{\Phi}^{\top}$ and the adjacency matrix. By truncating the spectrum to $k$ components, the diffusion cost scales linearly with the number of nodes $n$, and a permutation-invariant transformer-like backbone can incorporate node features by concatenating them to $\boldsymbol{\Phi}$. A two-stage pipeline couples Spectral Diffusion with a Provably Powerful Graph Network (PPGN) that refines a noisy Laplacian-based adjacency into a binary graph, enabling direct conditioning on spectral properties at inference. The model demonstrates competitive fidelity and controllability on synthetic and real-world graphs: generation can be conditioned on targeted eigenvalues or eigenvectors, and the spectral diffusion stage is faster than quadratic-diffusion baselines. Limitations include the need to select the most informative frequency components and the quadratic complexity of the predictor, which suggest future work on adaptive spectrum selection and sparse adjacency representations.

Abstract

In this paper, we present GGSD, a novel graph generative model based on 1) the spectral decomposition of the graph Laplacian matrix and 2) a diffusion process. Specifically, we propose to use a denoising model to sample eigenvectors and eigenvalues from which we can reconstruct the graph Laplacian and adjacency matrix. Using the Laplacian spectrum allows us to naturally capture the structural characteristics of the graph and work directly in the node space while avoiding the quadratic complexity bottleneck that limits the applicability of other diffusion-based methods. This, in turn, is accomplished by truncating the spectrum, which, as we show in our experiments, results in a faster yet accurate generative process, and by designing a novel transformer-based architecture linear in the number of nodes. Our permutation invariant model can also handle node features by concatenating them to the eigenvectors of each node. An extensive set of experiments on both synthetic and real-world graphs demonstrates the strengths of our model against state-of-the-art alternatives.
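The role of the truncated spectrum can be illustrated with a small NumPy sketch. It is not the authors' implementation: it assumes the combinatorial Laplacian $\mathbf{L}=\mathbf{D}-\mathbf{A}$, keeps the $k$ smallest eigenpairs, and replaces the learned graph predictor with a plain threshold on the off-diagonal entries. The function names `truncated_laplacian` and `adjacency_from_laplacian` and the threshold value are hypothetical.

```python
import numpy as np

def truncated_laplacian(adj, k):
    """Return the k smallest Laplacian eigenpairs and the rank-k reconstruction.

    Assumption: combinatorial Laplacian L = D - A of an undirected graph.
    """
    deg = np.diag(adj.sum(axis=1))
    lap = deg - adj
    lam, phi = np.linalg.eigh(lap)        # eigenvalues in ascending order
    lam_k, phi_k = lam[:k], phi[:, :k]    # keep the k lowest frequencies
    lap_k = phi_k @ np.diag(lam_k) @ phi_k.T
    return lam_k, phi_k, lap_k

def adjacency_from_laplacian(lap, threshold=0.5):
    """Naive stand-in for the graph predictor: threshold -L off the diagonal."""
    adj = -lap.copy()
    np.fill_diagonal(adj, 0.0)
    return (adj > threshold).astype(int)
```

With $k=n$ the reconstruction is exact and the threshold recovers the input graph. With $k<n$ the reconstructed Laplacian is only an approximation, which is why a separate predictor is needed to turn it into a binary adjacency matrix. The diffusion model only has to handle the $n \times k$ eigenvector matrix and $k$ eigenvalues rather than an $n \times n$ matrix.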

Paper Structure (30 sections, 10 equations, 8 figures, 6 tables)

Introduction
Related Work
Denoising Diffusion Models
Our Method
Spectral Diffusion
Graph Predictor
Experimental Evaluation
Datasets.
Evaluation Metrics.
Baselines.
Experimental Setup.
Evaluating the Generated Graphs
Synthetic Datasets
Real-world Datasets
Graph Predictor Ablation
...and 15 more sections

Figures (8)

Figure 1: GGSD pipeline. During the spectral diffusion process (left), the neural network is trained to predict the denoising steps for the eigenvectors $\phi$ and eigenvalues $\lambda$ of the graph Laplacian. The second stage of our method is the graph predictor (right), where we train a Provably Powerful Graph Network (PPGN) (Maron et al., 2019), similar to what was done in SPECTRE (Martinkus et al., 2022). Given the generated eigenvalues and eigenvectors, it predicts the adjacency matrix.
Figure 2: The score model takes as input the noisy eigenvector matrix and eigenvalues at time $t$ and predicts the noise of the data to be used in the denoising step. The $k$ node feature eigenvectors $\mathbf{\Phi}_t^0$ are projected through an MLP to a $d$-dimensional space. The sequence of $k$ eigenvalues is given as input to a 1D convolutional layer, which outputs $d$ features for each eigenvalue. Both eigenvectors and eigenvalues go through a series of $L$ layers composed of two multi-head cross-attention blocks, one updating the eigenvectors conditioned on the eigenvalues and one updating the eigenvalues conditioned on the eigenvectors. After each layer, we apply a residual block $\oplus$, which adds to the layer input the updated values scaled and shifted by time-dependent factors. Finally, $\mathbf{\Phi}_t^L$ and $\boldsymbol{\lambda}_t^L$ are projected to a $k$-dimensional space through an MLP and a 1D convolution.
Figure 3: Left column: comparison of the eigenvectors generated by the diffusion module of GGSD with the eigenvectors recomputed from the Laplacian of the adjacency matrix predicted by the PPGN module, on the Community (top) and SBM (bottom) datasets. The models have been trained with the 8 smallest eigenvectors. On the smaller dataset (Community), the diffusion generates nearly perfect eigenvectors. On the more challenging SBM dataset, the generated eigenvectors differ slightly from those computed on the predicted graph while preserving the overall structure. Right column: comparison of the interpolated eigenvectors that SPECTRE uses to condition the PPGN module with the actual eigenvectors of the generated graph. In this case, the eigenvector structure is completely lost in the generative process.
Figure 4: Performance analysis without (Generated) and with (Orthonormal) reprojecting the generated eigenvectors to an orthonormal basis. The average error represents the mean degradation of metrics between the generated graphs and the training set. We report both the mean and the standard deviation as error bars on 10 generations of 200 graphs. Specifically, Degree, Cluster, and Spectral metrics are calculated between the generated graphs and the test set, then normalized by the metrics between the training and test sets.
Figure 4: Conditioning of the generation on the 3 smallest eigenvectors as an inpainting task using RePaint (Lugmayr et al., 2022) (Figure). Number of communities in the graphs generated with spectrum conditioning (Table). Higher values should appear on the bold diagonal.
...and 3 more figures
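
The "Orthonormal" setting in the reprojection caption above can be sketched as follows. The captions do not say which reprojection is used, so this example assumes a reduced QR decomposition. The function name `reproject_orthonormal` is hypothetical, and the sign correction is an illustrative choice that keeps each output column pointing the same way as its input column.

```python
import numpy as np

def reproject_orthonormal(phi):
    """Map an n x k matrix of generated eigenvectors to an orthonormal basis.

    Assumption: reduced QR is used; the source does not specify the method.
    """
    q, r = np.linalg.qr(phi)       # q is n x k with orthonormal columns
    signs = np.sign(np.diag(r))    # QR may flip column signs
    signs[signs == 0] = 1.0
    return q * signs               # undo the flips column by column
```

If the generated eigenvectors are already orthonormal, this returns them unchanged up to floating-point error, so the comparison in the caption isolates the effect of the residual non-orthogonality left by the diffusion model.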
