A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer

Zhangyang Gao; Daize Dong; Cheng Tan; Jun Xia; Bozhen Hu; Stan Z. Li

TL;DR

GraphsGPT introduces a pure-Transformer pipeline that converts non-Euclidean graphs into a fixed-length Euclidean sequence of Graph Words via Graph2Seq and then reconstructs the original graph with GraphGPT. Edge-centric generation and decoupled Graph Position Encodings enable end-to-end representation and generation, trained with a GPT-style self-supervised objective on ~100M molecules. Pretraining yields state-of-the-art representation performance on multiple MoleculeNet tasks, and also enables few-shot generation, controllable generation, and graph mixup in Euclidean space. The framework demonstrates robustness to permutation and opens a new paradigm in which graph data is mapped into a Euclidean latent space for manipulation and optimization, then mapped back.

Abstract

Can we model non-Euclidean graphs as pure language or even Euclidean vectors while retaining their inherent information? The non-Euclidean property has posed a long-term challenge in graph modeling. Although recent graph neural networks and graph transformers encode graphs as Euclidean vectors, recovering the original graph from those vectors remains a challenge. In this paper, we introduce GraphsGPT, featuring a Graph2Seq encoder that transforms non-Euclidean graphs into learnable Graph Words in Euclidean space, along with a GraphGPT decoder that reconstructs the original graph from the Graph Words to ensure information equivalence. We pretrain GraphsGPT on $100$M molecules and report several findings: (1) The pretrained Graph2Seq excels in graph representation learning, achieving state-of-the-art results on $8/9$ graph classification and regression tasks. (2) The pretrained GraphGPT serves as a strong graph generator, as shown by its ability to perform both few-shot and conditional graph generation. (3) Graph2Seq+GraphGPT enables effective graph mixup in Euclidean space, overcoming previously known non-Euclidean challenges. (4) The edge-centric pretraining framework GraphsGPT proves effective on graph domain tasks, excelling in both representation and generation. Code is available at \href{https://github.com/A4Bio/GraphsGPT}{GitHub}.
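To make finding (3) concrete, the sketch below shows what mixup in Euclidean space could look like once two graphs have each been encoded into $K$ Graph Words: a plain convex combination of the two word sequences. This is only an illustration under stated assumptions. The function name `mixup_graph_words`, the list-of-lists representation, and the mixing weight `lam` are hypothetical, the Graph2Seq encoder and GraphGPT decoder are not shown, and the exact mixup procedure used in the paper may differ.

```python
def mixup_graph_words(words_a, words_b, lam):
    """Convex combination of two Graph Word sequences (hypothetical helper).

    words_a, words_b: K vectors each, given as lists of floats of equal length.
    lam: weight on words_a, in [0, 1]; words_b receives weight 1 - lam.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(words_a) != len(words_b):
        raise ValueError("both graphs must be encoded into the same number of words")

    mixed = []
    for vec_a, vec_b in zip(words_a, words_b):
        if len(vec_a) != len(vec_b):
            raise ValueError("word vectors must share one dimension")
        # Element-wise interpolation; this is well defined only because the
        # Graph Words are fixed-length vectors in Euclidean space.
        mixed.append([lam * a + (1.0 - lam) * b for a, b in zip(vec_a, vec_b)])
    return mixed


# Toy example with K = 1 word of dimension 2 (made-up values, not model outputs).
mixed_words = mixup_graph_words([[0.0, 2.0]], [[4.0, 6.0]], lam=0.25)
```

The same interpolation is not directly defined on the raw graphs, whose node counts and orderings generally differ, which is why a fixed-length Euclidean encoding is the prerequisite here.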

Paper Structure (49 sections, 15 equations, 10 figures, 8 tables, 1 algorithm)

Introduction
Related Work
Graph2Vec.
Graph Transformers.
Graph Self-Supervised Learning.
Motivation.
Method
Overall Framework
Graph2Seq Encoder
Flexible Token Sequence ($\texttt{FTSeq}$).
Euclidean Graph Words.
Graph Vocabulary.
GraphGPT Decoder
Edge-Centric Graph Generation.
Step 0: First Node Initialization.
...and 34 more sections

Figures (10)

Figure 1: The overall framework of GraphsGPT. The Graph2Seq encoder transforms the non-Euclidean graph into Euclidean Graph Words, which are then fed into the GraphGPT decoder to auto-regressively generate the original non-Euclidean graph. Both Graph2Seq and GraphGPT use a pure Transformer architecture.
Figure 2: Graph to Flexible Sequence.
Figure 3: Overview of edge-centric graph generation.
Figure 4: Block-wise causal attention, with grey cells indicating masked positions. Graph Words contribute to the generation through full attention, serving as prefix prompts.
Figure 5: Property distribution of generated molecules under different conditions using GraphsGPT-1W-C. "Dataset" denotes the distribution of the training dataset (ZINC-C).
...and 5 more figures
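
The Figure 4 caption describes an attention pattern that can be sketched in a few lines. The code below is one plausible reading of that caption, not the authors' implementation: prefix tokens (the Graph Words) are visible to every position, while generated tokens are grouped into blocks and each token may attend to its own block and to earlier blocks. The function name `block_causal_mask`, the block sizes, and the choice that prefix tokens attend only to each other are assumptions made for illustration.

```python
def block_causal_mask(num_prefix, block_sizes):
    """Build a boolean attention mask (hypothetical helper).

    allowed[i][j] is True when query position i may attend to key position j.
    num_prefix: number of prefix tokens (the Graph Words in the caption).
    block_sizes: sizes of the generated-token blocks, in generation order.
    """
    # All prefix tokens share block id 0; generated blocks get ids 1, 2, ...
    block_ids = [0] * num_prefix
    for block_index, size in enumerate(block_sizes, start=1):
        block_ids.extend([block_index] * size)

    n = len(block_ids)
    # A key is visible if its block is not later than the query's block.
    # Positions where this is False correspond to the masked (grey) cells.
    return [[block_ids[j] <= block_ids[i] for j in range(n)] for i in range(n)]


# Two prefix tokens followed by a block of one token and a block of two tokens.
mask = block_causal_mask(2, [1, 2])
for row in mask:
    print("".join("1" if allowed else "0" for allowed in row))
```

Because every generated block has a larger id than the prefix, each generated position sees all prefix tokens in full, which matches the caption's description of the Graph Words acting as prefix prompts.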
