Overcoming Order in Autoregressive Graph Generation

Edo Cohen-Karlik, Eyal Rozenberg, Daniel Freedman

TL;DR

The paper addresses the problem that autoregressive graph generation imposes an arbitrary order on graphs, which can hinder learning when data are scarce. It introduces Orderless Regularization (OLR), a training term that encourages the model's hidden state to be invariant across DFS traversals of the same graph that share the same end vertex. Empirically, OLR improves performance on Wiener index regression and de novo molecule generation (MOSES/ZINC), with particular strength when data are scarce, which suggests broader applicability to graph-synthesis tasks. Overall, the work provides a principled way to inject a structure-invariance bias into autoregressive graph generators.

Abstract

Graph generation is a fundamental problem in various domains, including chemistry and social networks. Recent work has shown that molecular graph generation using recurrent neural networks (RNNs) is advantageous compared to traditional generative approaches which require converting continuous latent representations into graphs. One issue which arises when treating graph generation as sequential generation is the arbitrary order of the sequence which results from a particular choice of graph flattening method. In this work we propose using RNNs, taking into account the non-sequential nature of graphs by adding an Orderless Regularization (OLR) term that encourages the hidden state of the recurrent model to be invariant to different valid orderings present under the training distribution. We demonstrate that sequential graph generation models benefit from our proposed regularization scheme, especially when data is scarce. Our findings contribute to the growing body of research on graph generation and provide a valuable tool for various applications requiring the synthesis of realistic and diverse graph structures.
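
To make the idea of an order-invariance penalty concrete, here is a minimal sketch. It is not the paper's formulation: the tiny Elman-style RNN, the squared L2 distance between final hidden states, and the names `rnn_final_state` and `orderless_penalty` are all assumptions chosen for illustration. The two input strings are the two orderings of the same graph quoted in the Figure 1 caption.

```python
import math
import random


def rnn_final_state(tokens, embed, w_h, w_x):
    """Run a tiny Elman RNN, h <- tanh(W_h h + W_x x), and return the final state."""
    dim = len(w_h)
    h = [0.0] * dim
    for tok in tokens:
        x = embed[tok]
        h = [
            math.tanh(
                sum(w_h[i][j] * h[j] for j in range(dim))
                + sum(w_x[i][j] * x[j] for j in range(dim))
            )
            for i in range(dim)
        ]
    return h


def orderless_penalty(seq_a, seq_b, embed, w_h, w_x):
    """Squared L2 distance between final hidden states (an assumed penalty form)."""
    h_a = rnn_final_state(seq_a, embed, w_h, w_x)
    h_b = rnn_final_state(seq_b, embed, w_h, w_x)
    return sum((a - b) ** 2 for a, b in zip(h_a, h_b))


# Hypothetical toy setup: random weights, one embedding per token.
rng = random.Random(0)
dim = 4
embed = {tok: [rng.uniform(-1, 1) for _ in range(dim)] for tok in "ABCDEF()"}
w_h = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]
w_x = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]

seq_a = "A(BEF)(C)D"  # two orderings of the same graph, same end vertex D
seq_b = "A(C)(BFE)D"
penalty = orderless_penalty(seq_a, seq_b, embed, w_h, w_x)
same = orderless_penalty(seq_a, seq_a, embed, w_h, w_x)
```

In training, a term like `penalty` would be added to the usual sequence loss with some weight, so that the model is pushed toward the same hidden state regardless of which valid ordering it saw. The weighting and the choice of which pairs to compare are left open here.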

Paper Structure

This paper contains 31 sections, 2 theorems, 17 equations, 3 figures, 2 tables.

Key Result

Proposition 3.3

For a connected graph $G$,

Figures (3)

  • Figure 1: Illustration of two DFS traversals of the same graph, starting from node $A$ and terminating at node $D$; blue lines denote traversal order. (Left) Traversal resulting in the sequence $A(BEF)(C)D$. (Right) Traversal resulting in the sequence $A(C)(BFE)D$. The parentheses denote the opening and closing of branches when traversing the tree; with this syntax it is possible to reconstruct the tree from such sequences. Note that multiple sequences correspond to the same tree, a fact that lies at the heart of this work.
  • Figure 2: Illustration of the same graph with two connected subgraphs. (Left) Subgraph that is not induced by DFS. (Right) Subgraph induced by DFS; arrows depict a traversal resulting in the sequence $BA(CF)D$.
  • Figure 3: Proof illustration. $S$ has a cycle and two different trajectories starting from $u$ and ending with $w$ ($urw$ and $uw(r)$). Concatenating with the trajectory from $z$ to $v$, we obtain two different DFS trajectories with a shared suffix.
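
The branch notation in the Figure 1 caption can be sketched in a few lines of Python. The edge list below is an assumption: it is one graph consistent with the two sequences in the caption, not a transcription of the figure. The helper names (`_visit`, `_expand`, `dfs_strings`) are hypothetical. The code enumerates every DFS traversal from a start node, wraps all but the last child of each node in parentheses, and then keeps the traversals that end at $D$.

```python
def _visit(adj, node, visited):
    """Yield (string, visited_set) for every DFS of the part reachable from `node`."""
    for subs, seen in _expand(adj, node, visited, []):
        branches = "".join(f"({s})" for s in subs[:-1])  # all but the last child
        tail = subs[-1] if subs else ""  # the last child continues the chain
        yield node + branches + tail, seen


def _expand(adj, node, visited, subs):
    """Yield every way to finish exploring `node`, given the child strings so far."""
    frontier = [n for n in adj[node] if n not in visited]
    if not frontier:
        yield subs, visited
        return
    for nxt in frontier:
        for sub, seen in _visit(adj, nxt, visited | {nxt}):
            yield from _expand(adj, node, seen, subs + [sub])


def dfs_strings(adj, start):
    """Return all distinct parenthesized DFS strings starting at `start`, sorted."""
    return sorted({s for s, _ in _visit(adj, start, frozenset([start]))})


# Assumed graph: A is joined to B, C, D, and B, E, F form a triangle.
adj = {
    "A": ["B", "C", "D"],
    "B": ["A", "E", "F"],
    "C": ["A"],
    "D": ["A"],
    "E": ["B", "F"],
    "F": ["B", "E"],
}

all_strings = dfs_strings(adj, "A")
ending_at_d = [s for s in all_strings if s.endswith("D")]
print(ending_at_d)
# ['A(BEF)(C)D', 'A(BFE)(C)D', 'A(C)(BEF)D', 'A(C)(BFE)D']
```

Because the string lists vertices in visit order, its last letter is the last vertex visited, so `endswith("D")` selects exactly the traversals that terminate at $D$. Both sequences from the caption appear in the result, alongside two further orderings of the same graph.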

Theorems & Definitions (10)

  • Definition 2.1
  • Definition 2.2
  • Definition 3.1
  • Definition 3.2
  • Proposition 3.3
  • Definition 3.4
  • Definition 3.5
  • Proposition 3.6
  • Proof (sketch)
  • Proof