Learning Deep Generative Models of Graphs

Yujia Li, Oriol Vinyals, Chris Dyer, Razvan Pascanu, Peter Battaglia

TL;DR

This paper introduces a general deep generative model for graphs that builds a graph sequentially, using graph nets to maintain structure-aware representations of the partial graph. Three decision modules drive the process: one decides whether to add a node, one decides whether to add an edge, and one selects which existing node the new edge connects to. Isomorphism-invariant propagation over the partial graph makes these probabilities context-aware. The approach is evaluated on synthetic graph generation, molecule generation, and conditional graph generation, outperforming Erdős–Rényi and LSTM baselines in many settings and showing strong capabilities with conditioning and in handling graph symmetries. The work highlights practical challenges around node and edge ordering, long generation sequences, and scalability, and outlines promising directions such as learning orderings, coarse-to-fine generation, and reinforcement-learning–based objective optimization. Overall, it demonstrates that graph nets can enable flexible, expressive generative modeling over arbitrary graphs, with significant implications for domains like chemistry and natural language processing.
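The sequential procedure summarized above can be illustrated with a minimal sketch. This is not the paper's implementation: in the paper each decision is computed from graph-net representations of the partial graph, whereas here the three decisions are stand-ins driven by fixed probabilities. The function name `generate_graph` and the parameters `add_node_prob`, `add_edge_prob`, and `max_nodes` are hypothetical and chosen only for illustration.

```python
import random


def generate_graph(add_node_prob=0.7, add_edge_prob=0.5, max_nodes=10, seed=0):
    """Toy sequential graph generator with three placeholder decisions.

    ASSUMPTION: fixed probabilities replace the learned, graph-conditioned
    decision modules described in the paper.
    """
    rng = random.Random(seed)
    nodes = []
    edges = []
    # Decision 1: add another node, or stop?
    while len(nodes) < max_nodes and rng.random() < add_node_prob:
        new = len(nodes)
        nodes.append(new)
        # Existing nodes the new node is not yet connected to.
        candidates = list(range(new))
        # Decision 2: add another edge from the new node, or move on?
        while candidates and rng.random() < add_edge_prob:
            # Decision 3: which existing node to connect to?
            # (Uniform choice here; a learned model would score each candidate.)
            target = rng.choice(candidates)
            candidates.remove(target)
            edges.append((new, target))
    return nodes, edges


if __name__ == "__main__":
    nodes, edges = generate_graph(seed=1)
    print("nodes:", nodes)
    print("edges:", edges)
```

Each edge is recorded as `(new_node, existing_node)`, so the order of the returned lists doubles as the generation sequence. Different orderings can produce the same graph, which is one source of the ordering challenge the paper discusses.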

Abstract

Graphs are fundamental data structures which concisely capture the relational structure in many important real-world domains, such as knowledge graphs, physical and social interactions, language, and chemistry. Here we introduce a powerful new approach for learning generative models over graphs, which can capture both their structure and attributes. Our approach uses graph neural networks to express probabilistic dependencies among a graph's nodes and edges, and can, in principle, learn distributions over any arbitrary graph. In a series of experiments our results show that once trained, our models can generate good quality samples of both synthetic graphs as well as real molecular graphs, both unconditionally and conditioned on data. Compared to baselines that do not use graph-structured representations, our models often perform far better. We also explore key challenges of learning generative models of graphs, such as how to handle symmetries and ordering of elements during the graph generation process, and offer possible solutions. Our work is the first and most general approach for learning generative models over arbitrary graphs, and opens new directions for moving away from restrictions of vector- and sequence-like knowledge representations, toward more expressive and flexible relational data structures.

Paper Structure

This paper contains 33 sections, 28 equations, 16 figures, 6 tables.

Figures (16)

  • Figure 1: Depiction of the steps taken during the generation process.
  • Figure 2: Illustration of the graph propagation process (left), graph level predictions using $f_\textit{addnode}$ and $f_\textit{addedge}$ (center), and node selection $f_\textit{nodes}$ modules (right).
  • Figure 3: Training curves for the graph model and LSTM model on three sets.
  • Figure 4: Degree histogram for samples generated by models trained on Barabási–Albert graphs. The histogram labeled "Ground Truth" shows the data distribution estimated from 10,000 examples.
  • Figure 5: NNc1nncc(O)n1
  • ...and 11 more figures