A Simple and Scalable Representation for Graph Generation

Yunhui Jang; Seul Lee; Sungsoo Ahn

TL;DR

This work addresses the scalability bottleneck of graph generation methods that output full adjacency matrices, whose size grows quadratically with the number of nodes. It introduces GEEL, a gap-encoded edge-list representation whose vocabulary size is bounded by $B^2$ and whose representation size is $M$, where $B$ is the graph bandwidth and $M$ is the number of edges. Paired with node positional encoding and an autoregressive LSTM generator, GEEL achieves $O(M)$ generation complexity, and a dedicated grammar extends it to attributed graphs. Empirically, GEEL yields state-of-the-art or competitive results across ten general graph benchmarks and two molecular datasets, with faster inference and lower memory use owing to the compact representation. The method thus offers a practical, scalable path for generating large graphs and molecules, and the code is released for reproducibility.

Abstract

Recently, there has been a surge of interest in employing neural networks for graph generation, a fundamental statistical learning problem with critical applications like molecule design and community analysis. However, most approaches encounter significant limitations when generating large-scale graphs. This is due to their requirement to output the full adjacency matrices whose size grows quadratically with the number of nodes. In response to this challenge, we introduce a new, simple, and scalable graph representation named gap encoded edge list (GEEL) that has a small representation size that aligns with the number of edges. In addition, GEEL significantly reduces the vocabulary size by incorporating the gap encoding and bandwidth restriction schemes. GEEL can be autoregressively generated with the incorporation of node positional encoding, and we further extend GEEL to deal with attributed graphs by designing a new grammar. Our findings reveal that the adoption of this compact representation not only enhances scalability but also bolsters performance by simplifying the graph generation process. We conduct a comprehensive evaluation across ten non-attributed and two molecular graph generation tasks, demonstrating the effectiveness of GEEL.
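To make the idea of gap encoding concrete, here is a minimal sketch of how an edge list might be turned into gap tokens and back. It is an illustration under stated assumptions, not the authors' implementation: the function names `gap_encode` and `gap_decode` are hypothetical, each edge is assumed to be a pair `(s, t)` with `s < t`, the list is assumed to be sorted lexicographically, and the two gaps are taken to be the difference between consecutive source nodes and the difference between target and source. The exact definition used by GEEL is given in the paper's Method section.

```python
from typing import List, Tuple

Edge = Tuple[int, int]
Token = Tuple[int, int]


def gap_encode(edges: List[Edge]) -> List[Token]:
    """Encode edges as (inter-edge gap, intra-edge gap) pairs.

    Assumption (for illustration only): every edge is (s, t) with s < t,
    and node indices start at 0.
    """
    tokens: List[Token] = []
    prev_source = 0
    for s, t in sorted(edges):
        # Gap between this edge's source and the previous edge's source,
        # followed by the gap between target and source of the same edge.
        tokens.append((s - prev_source, t - s))
        prev_source = s
    return tokens


def gap_decode(tokens: List[Token]) -> List[Edge]:
    """Invert gap_encode by accumulating the source gaps."""
    edges: List[Edge] = []
    source = 0
    for inter_gap, intra_gap in tokens:
        source += inter_gap
        edges.append((source, source + intra_gap))
    return edges


if __name__ == "__main__":
    example_edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
    tokens = gap_encode(example_edges)
    print(tokens)
    print(gap_decode(tokens) == sorted(example_edges))
```

In this sketch the token sequence has one entry per edge, so its length equals the number of edges $M$. Each gap stays small whenever connected nodes have nearby indices, which is the intuition behind bounding the vocabulary by the bandwidth rather than by the number of nodes.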

Paper Structure (35 sections, 9 equations, 16 figures, 17 tables)

Introduction
Related work
Method
Gap encoded edge list representation (GEEL)
Autoregressive generation of GEEL and node positional encoding
GEEL for attributed graphs
Experiment
General graph generation
Molecular graph generation
Ablation studies
Conclusion
Experimental Details
General graph generation
Molecular graph generation
Implementation Details
...and 20 more sections

Figures (16)

Figure 1: Overview and advantages of gap encoded edge list (GEEL).
Figure 2: Bandwidth of an adjacency matrix.
Figure 3: An example of attributed GEEL. The colored parts of the attributed GEEL denote the node features (i.e., C and N) and edge features (i.e., single bond -). The shaded parts denote the self-loops added to the original GEEL, where self-loops are added to the nodes that are not connected to the nodes with larger node indices (i.e., nodes with indices 3 and 4).
Figure 4: Inference time on various graph sizes.
Figure 5: Average MMD results for different model architectures.
...and 11 more figures
