Graph as Point Set

Xiyuan Wang, Pan Li, Muhan Zhang

TL;DR

This work reframes graph representation learning by mapping graphs to point sets via Symmetric Rank Decomposition of the augmented adjacency $L=D+A$, enabling permutation- and orthogonal-invariant set encoders to operate on graphs. The core contributions include PSRD coordinates and the Point Set Transformer (PST), which together achieve strong long-range and short-range expressivity, surpassing several GNNs and graph transformers on substructure counting and molecular-property benchmarks, while unifying various structural encodings under a principled, lossless framework. An alternative DeepSet-based encoder (PSDS) demonstrates the versatility of the approach. While offering substantial gains, the method retains scalability limitations inherent to Transformer-style attention on large graphs, suggesting future refinements such as sparse or linear attention to broaden applicability.

Abstract

Graph is a fundamental data structure to model interconnections between entities. Set, on the contrary, stores independent elements. To learn graph representations, current Graph Neural Networks (GNNs) primarily use message passing to encode the interconnections. In contrast, this paper introduces a novel graph-to-set conversion method that bijectively transforms interconnected nodes into a set of independent points and then uses a set encoder to learn the graph representation. This conversion method holds dual significance. Firstly, it enables using set encoders to learn from graphs, thereby significantly expanding the design space of GNNs. Secondly, for Transformer, a specific set encoder, we provide a novel and principled approach to inject graph information losslessly, different from all the heuristic structural/positional encoding methods adopted in previous graph transformers. To demonstrate the effectiveness of our approach, we introduce Point Set Transformer (PST), a transformer architecture that accepts a point set converted from a graph as input. Theoretically, PST exhibits superior expressivity for both short-range substructure counting and long-range shortest path distance tasks compared to existing GNNs. Extensive experiments further validate PST's outstanding real-world performance. Besides Transformer, we also devise a Deepset-based set encoder, which achieves performance comparable to representative GNNs, affirming the versatility of our graph-to-set method.
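As an illustration of the graph-to-set conversion described above, the following NumPy sketch computes one symmetric rank decomposition of $D+A$ by eigendecomposition and checks that the adjacency matrix can be read back from the inner products of the resulting points. The function name `srd_coordinates`, the tolerance `tol`, and the 4-node path graph are assumptions made for this example; this is not the authors' implementation.

```python
import numpy as np

def srd_coordinates(adj, tol=1e-8):
    """Return Q of shape (n, r) with Q @ Q.T == D + A (one possible SRD)."""
    adj = np.asarray(adj, dtype=float)
    deg = np.diag(adj.sum(axis=1))
    m = deg + adj                          # D + A is positive semi-definite
    eigvals, eigvecs = np.linalg.eigh(m)   # eigenvalues in ascending order
    keep = eigvals > tol                   # drop (numerically) zero eigenvalues
    return eigvecs[:, keep] * np.sqrt(eigvals[keep])

# Hypothetical example graph: the path 0 - 1 - 2 - 3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

Q = srd_coordinates(A)                     # one point (row) per node
gram = Q @ Q.T                             # reconstructs D + A
A_rec = gram - np.diag(np.diag(gram))      # off-diagonal part is A
deg_rec = np.diag(gram)                    # squared norms are node degrees
```

Because $QQ^\top = D + A$, inner products between distinct points recover the edges and squared norms recover the degrees, which is the sense in which the conversion loses no graph information.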

Paper Structure

This paper contains 35 sections, 27 theorems, 51 equations, 4 figures, and 13 tables.

Key Result

Proposition 2.3

Matrices $Q_1$ and $Q_2$ in $\mathbb{R}^{n \times r}$ are SRDs of the same matrix if and only if there exists $R \in O(r)$ such that $Q_1 = Q_2 R$.
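The proposition can be checked numerically. The sketch below is a sanity check under assumed inputs (a random Gaussian `Q2`, a fixed seed, and a `1e-8` eigenvalue threshold, none of which come from the paper): it verifies that rotating an SRD by an orthogonal matrix gives another SRD of the same matrix, and that two independently built SRDs of the same matrix differ by an orthogonal matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 5, 3

# Q2 is an SRD of L = Q2 Q2^T (rank r with probability 1 for Gaussian entries).
Q2 = rng.standard_normal((n, r))
L = Q2 @ Q2.T

# "If" direction: rotating the coordinates by any R in O(r) leaves L unchanged.
R_rand, _ = np.linalg.qr(rng.standard_normal((r, r)))
rotated_is_srd = np.allclose((Q2 @ R_rand) @ (Q2 @ R_rand).T, L)

# "Only if" direction: build a second SRD of L independently ...
eigvals, eigvecs = np.linalg.eigh(L)
keep = eigvals > 1e-8
Q1 = eigvecs[:, keep] * np.sqrt(eigvals[keep])

# ... and solve Q2 @ R = Q1 for R in the least-squares sense.
R, *_ = np.linalg.lstsq(Q2, Q1, rcond=None)
r_is_orthogonal = np.allclose(R.T @ R, np.eye(r))
exact_fit = np.allclose(Q2 @ R, Q1)
```

This ambiguity is why a set encoder applied to SRD coordinates needs to be invariant to orthogonal transformations of the points as well as to permutations of them.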

Figures (4)

  • Figure 1: Our method first converts the input graph to a point set and then encodes it with a set encoder. $O(r)$ denotes the set of $r$-dimensional orthogonal transformations.
  • Figure 2: The failure of using inner products of permutation-equivariant node representations to predict shortest path distance. $v_2$ and $v_3$ have equal node representations due to symmetry. Therefore, $(v_1, v_2)$ and $(v_1, v_3)$ will have the same inner products of node representations but different shortest path distances.
  • Figure 3: The pipeline of parameterized SRD. We first decompose the Laplacian matrix (or another matrix) to obtain the non-zero eigenvalues and the corresponding eigenvectors. The eigenvalues are then transformed with a DeepSet. Multiplying the transformed eigenvalues with the eigenvectors yields the coordinates.
  • Figure 4: Architecture of Point Set Transformer (PST). (a) PST contains several layers. Each layer is composed of a scalar-vector (sv)-mixer and an attention layer. (b) The architecture of the sv-mixer. (c) The architecture of the attention layer. $s_i$ and $s_i'$ denote the scalar representations of node $i$, and $\vec{v}_i$ and $\vec{v}_i'$ denote the vector representations. $x_i$ is the initial feature of node $i$. $Q_i$ denotes the point coordinates of node $i$ produced by parameterized SRD in Section \ref{sec::psrd}.
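The parameterized SRD pipeline summarized in the Figure 3 caption can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the paper's architecture: the matrix being decomposed is taken to be $D+A$, and the DeepSet form, the random weights `w1`, `w2`, `w3`, the hidden size, and the number of output channels `d` are all hypothetical choices.

```python
import numpy as np

def deepset_eigenvalue_map(eigvals, w1, w2, w3):
    """Toy permutation-equivariant DeepSet over eigenvalues -> (r, d) scales."""
    h = np.tanh(eigvals[:, None] @ w1)        # (r, hidden): per-eigenvalue embedding
    pooled = h.mean(axis=0, keepdims=True)    # (1, hidden): order-independent summary
    return (h + pooled @ w2) @ w3             # (r, d): one scale per channel

def psrd_coordinates(adj, params, tol=1e-8):
    """Sketch of parameterized SRD: coordinates of shape (n, r, d)."""
    adj = np.asarray(adj, dtype=float)
    deg = np.diag(adj.sum(axis=1))
    eigvals, eigvecs = np.linalg.eigh(deg + adj)
    keep = eigvals > tol                      # non-zero eigenvalues only
    lam, U = eigvals[keep], eigvecs[:, keep]
    scale = deepset_eigenvalue_map(lam, *params)
    return U[:, :, None] * scale[None, :, :]  # scale each eigenvector per channel

rng = np.random.default_rng(0)
hidden, d = 8, 2                              # hypothetical sizes
params = (rng.standard_normal((1, hidden)),
          rng.standard_normal((hidden, hidden)),
          rng.standard_normal((hidden, d)))

# Hypothetical example graph: the path 0 - 1 - 2 - 3, and a relabelled copy.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
perm = [2, 0, 3, 1]
A_perm = A[perm][:, perm]

coords = psrd_coordinates(A, params)
coords_perm = psrd_coordinates(A_perm, params)

# Per-channel Gram matrices do not depend on the eigenvector basis.
G = np.einsum("irk,jrk->ijk", coords, coords)
G_perm = np.einsum("irk,jrk->ijk", coords_perm, coords_perm)
```

Comparing Gram matrices rather than raw coordinates sidesteps the sign and rotation ambiguity of eigenvectors: relabelling the nodes only permutes the rows and columns of each channel's Gram matrix.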

Theorems & Definitions (47)

  • Definition 2.1
  • Definition 2.2
  • Proposition 2.3
  • Theorem 3.1
  • Definition 3.2
  • Theorem 3.3
  • Theorem 5.1
  • Theorem 5.2
  • Theorem 5.3
  • Theorem 5.4
  • ...and 37 more