
Graph Generation with $K^2$-trees

Yunhui Jang, Dongwoo Kim, Sungsoo Ahn

TL;DR

This work introduces a novel graph generation method based on the $K^2$-tree representation, originally designed for lossless graph compression, together with a Transformer-based architecture that generates the resulting sequence using a specialized tree positional encoding scheme.

Abstract

Generating graphs from a target distribution is a significant challenge across many domains, including drug discovery and social network analysis. In this work, we introduce a novel graph generation method leveraging the $K^2$-tree representation, originally designed for lossless graph compression. The $K^2$-tree representation encompasses inherent hierarchy while enabling compact graph generation. In addition, we make contributions by (1) presenting a sequential $K^2$-tree representation that incorporates pruning, flattening, and tokenization processes and (2) introducing a Transformer-based architecture designed to generate the sequence by incorporating a specialized tree positional encoding scheme. Finally, we extensively evaluate our algorithm on four general and two molecular graph datasets to confirm its superiority for graph generation.

Paper Structure (31 sections, 16 figures, 8 tables, 4 algorithms)


Figures (16)

  • Figure 1: (Left) Various representations used for graph generation. (Right) Comparing graph generative methods in terms of used graph representation. The comparison is made with respect to a method being hierarchical (H), able to handle attributed graphs (A), and domain-agnostic (DA).
  • Figure 2: $K^{2}$-tree with $K=2$. The $K^{2}$-tree describes the hierarchy of the adjacency matrix iteratively being partitioned to $K\times K$ submatrices. It is compact due to summarizing any zero-filled submatrix with a size larger than $1\times 1$ (shaded in grey) by a leaf node $u$ with label $x_{u}=0$.
  • Figure 3: Illustration of the sequential representation for $K^{2}$-tree. The shaded parts of the adjacency matrix $A$ and the $K^{2}$-tree $\mathcal{T}$ denote redundant parts, which are further pruned, while the purple-colored parts of $A$ and $\mathcal{T}$ denote non-redundant parts. Also, same-colored tree-nodes of pruned $K^{2}$-tree are grouped and tokenized into the same-colored parts of the sequence $\bm{y}$.
  • Figure 4: Illustration of the tree-node positions of $K^{2}$-tree. The shaded parts of the adjacency matrix denote redundant parts, e.g., $p_{u}<q_{u}$. Additionally, colored elements correspond to tree-nodes of the same color and the same-colored tree-edges signify the root-to-target downward path. Blue and red tuples denote the order in the first and second levels, respectively. The tree-node $u$ is non-redundant as $p_{u}>q_{u}$ while $v$ is redundant as $p_{v}<q_{v}$.
  • Figure 5: An example of featured $K^{2}$-tree representation. The shaded parts of the adjacency matrix and $K^{2}$-tree denote the redundant parts. The black-colored tree-nodes denote the normal tree-nodes with binary attributes while other-colored feature elements in the adjacency matrix $A$ denote the same-colored featured tree-nodes and sequence elements. The node features (i.e., C and N) and edge feature (i.e., single bond $-$) of the molecule are represented within the leaf nodes.
  • ...and 11 more figures
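
The partitioning described in the Figure 2 caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `k2_tree_bits`, the row-major ordering of children, the breadth-first output order, and the zero-padding of the matrix to a power of $K$ are all assumptions made for illustration. The sketch also omits the pruning of redundant parts and the tokenization shown in Figure 3.

```python
from collections import deque


def k2_tree_bits(adj, k=2):
    """Breadth-first labels of the non-root nodes of a K^2-tree.

    Hypothetical helper: `adj` is a square 0/1 matrix given as a list of lists.
    A label of 0 marks an all-zero submatrix (a leaf that is not expanded);
    a label of 1 marks a submatrix containing at least one 1.
    """
    n = len(adj)
    # Assumption: pad the matrix with zeros up to a power of k (at least k).
    size = k
    while size < n:
        size *= k

    def has_one(r0, c0, s):
        # Only scan cells inside the real matrix; padding is implicitly zero.
        return any(
            adj[r][c]
            for r in range(r0, min(r0 + s, n))
            for c in range(c0, min(c0 + s, n))
        )

    bits = []
    queue = deque([(0, 0, size)])  # the root covers the whole padded matrix
    while queue:
        r0, c0, s = queue.popleft()
        child = s // k
        for i in range(k):  # assumed row-major child order
            for j in range(k):
                r, c = r0 + i * child, c0 + j * child
                label = 1 if has_one(r, c, child) else 0
                bits.append(label)
                # Expand only non-zero submatrices larger than 1 x 1.
                if label and child > 1:
                    queue.append((r, c, child))
    return bits


A = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
bits = k2_tree_bits(A)
```

For this 4×4 example the two zero-filled off-diagonal blocks each collapse into a single leaf, so the tree stores 12 labels rather than 16 matrix entries.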