Graph U-Nets

Hongyang Gao, Shuiwang Ji

TL;DR

The paper addresses representation learning on graphs by introducing graph U-Nets, an encoder-decoder framework built with novel graph pooling (gPool) and graph unpooling (gUnpool) layers to enable hierarchical downsampling and upsampling on irregular graphs. gPool uses a trainable projection to select the most informative nodes, while gUnpool restores the original graph structure using saved node indices; these are integrated into a U-Net-like architecture with GCNs and skip connections. To maintain connectivity after pooling, the authors augment graphs with their second power and introduce an enhanced GCN that weights self-loops more strongly. Empirical results on node and graph classification show consistent improvements over baselines, with ablations confirming the utility of gPool/gUnpool and the connectivity augmentation, and deeper networks offering gains up to a point before overfitting.
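
The connectivity augmentation and the self-loop weighting mentioned above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' code: the function names, the choice to binarize the union of the graph and its second power, and the default `self_loop_weight=2.0` are all assumptions made here for concreteness.

```python
import numpy as np

def augment_with_second_power(adj):
    """Connect every pair of nodes that are at most two hops apart.

    Assumption: the result is binarized and self-loops are dropped here;
    the summary only states that the second graph power is used.
    """
    adj = (np.asarray(adj) != 0).astype(float)
    two_hop = adj @ adj                      # counts walks of length 2
    out = ((adj + two_hop) > 0).astype(float)
    np.fill_diagonal(out, 0.0)
    return out

def normalized_adjacency(adj, self_loop_weight=2.0):
    """Symmetric GCN normalization with a tunable self-loop weight.

    self_loop_weight=1.0 gives the usual A + I; larger values (hypothetical
    default 2.0) give a node's own features more weight.
    """
    a_hat = adj + self_loop_weight * np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# Path graph 0 - 1 - 2: nodes 0 and 2 are two hops apart.
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]], dtype=float)
augmented = augment_with_second_power(path)
normalized = normalized_adjacency(augmented)
```

After augmentation, nodes 0 and 2 share an edge, so dropping node 1 in a later pooling step would no longer disconnect them.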

Abstract

We consider the problem of representation learning for graph data. Convolutional neural networks can naturally operate on images, but face significant challenges in dealing with graph data. Given that images are special cases of graphs whose nodes lie on 2D lattices, graph embedding tasks have a natural correspondence with image pixel-wise prediction tasks such as segmentation. While encoder-decoder architectures like U-Nets have been successfully applied to many image pixel-wise prediction tasks, similar methods are lacking for graph data. This is because pooling and up-sampling operations are not natural on graph data. To address these challenges, we propose novel graph pooling (gPool) and unpooling (gUnpool) operations in this work. The gPool layer adaptively selects some nodes to form a smaller graph based on their scalar projection values on a trainable projection vector. We further propose the gUnpool layer as the inverse operation of the gPool layer. The gUnpool layer restores the graph to its original structure using the position information of nodes selected in the corresponding gPool layer. Based on our proposed gPool and gUnpool layers, we develop an encoder-decoder model on graphs, known as the graph U-Nets. Our experimental results on node classification and graph classification tasks demonstrate that our methods achieve consistently better performance than previous models.

Paper Structure

This paper contains 17 sections, 4 equations, 3 figures, 8 tables.

Figures (3)

  • Figure 1: An illustration of the proposed graph pooling layer with $k=2$. $\times$ and $\odot$ denote matrix multiplication and element-wise product, respectively. We consider a graph with 4 nodes, where each node has 5 features. By processing this graph, we obtain the adjacency matrix $A^\ell \in \mathbb{R}^{4 \times 4}$ and the input feature matrix $X^\ell \in \mathbb{R}^{4 \times 5}$ of layer $\ell$. In the projection stage, $\mathbf p \in \mathbb{R}^{5}$ is a trainable projection vector. By matrix multiplication and $\mathrm{sigmoid}(\cdot)$, we obtain $\mathbf y$, the vector of scores estimating the scalar projection value of each node onto the projection vector. With $k=2$, we select the two nodes with the highest scores and record their indices in the top-$k$ node selection stage. We use the indices to extract the corresponding nodes to form a new graph, resulting in the pooled feature map $\tilde{X}^{\ell}$ and the corresponding new adjacency matrix $A^{\ell+1}$. At the gate stage, we perform element-wise multiplication between $\tilde{X}^{\ell}$ and the selected node score vector $\mathbf{\tilde{y}}$, resulting in $X^{\ell+1}$. This graph pooling layer outputs $A^{\ell+1}$ and $X^{\ell+1}$.
  • Figure 2: An illustration of the proposed graph unpooling (gUnpool) layer. In this example, a graph with 7 nodes is down-sampled using a gPool layer, resulting in a coarsened graph with 4 nodes and position information of selected nodes. The corresponding gUnpool layer uses the position information to reconstruct the original graph structure by using empty feature vectors for unselected nodes.
  • Figure 3: An illustration of the proposed graph U-Nets (g-U-Nets). In this example, each node in the input graph has two features. The input feature vectors are transformed into low-dimensional representations using a GCN layer. After that, we stack two encoder blocks, each of which contains a gPool layer and a GCN layer. The decoder part also has two decoder blocks, each consisting of a gUnpool layer and a GCN layer. For blocks at the same level, the decoder block uses a skip connection to fuse the low-level spatial features from the encoder block. The output feature vectors of nodes in the last layer form the network embedding, which can be used for various tasks such as node classification and link prediction.
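
The gPool and gUnpool steps described in Figures 1 and 2 can be sketched in NumPy as below. This is a hedged illustration under stated assumptions, not the authors' implementation: the function names `gpool` and `gunpool` are hypothetical, the projection vector is fixed rather than trained, scores are divided by the norm of `p` to match the phrase "scalar projection", and selected indices are sorted so that the pooled graph keeps the original node order. Because the sigmoid is monotonic, applying it before or after the top-k selection picks the same nodes.

```python
import numpy as np

def gpool(adj, x, p, k):
    """Keep the k nodes with the largest scalar projection onto p."""
    scores = x @ p / np.linalg.norm(p)            # one score per node
    top = np.argsort(-scores, kind="stable")[:k]  # indices of the k best nodes
    idx = np.sort(top)                            # assumption: keep node order
    gate = 1.0 / (1.0 + np.exp(-scores[idx]))     # sigmoid of selected scores
    x_pooled = x[idx] * gate[:, None]             # gate the selected features
    adj_pooled = adj[np.ix_(idx, idx)]            # induced subgraph
    return adj_pooled, x_pooled, idx

def gunpool(adj_orig, x_pooled, idx):
    """Put pooled features back at their saved positions; others stay zero."""
    n_nodes = adj_orig.shape[0]
    x_restored = np.zeros((n_nodes, x_pooled.shape[1]), dtype=x_pooled.dtype)
    x_restored[idx] = x_pooled
    return adj_orig, x_restored

# Same sizes as Figure 1: 4 nodes, 5 features per node, k = 2.
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)       # ring 0-1-2-3-0
x = np.arange(20, dtype=float).reshape(4, 5)
p = np.ones(5)                                    # stand-in for a trained vector

adj_small, x_small, idx = gpool(adj, x, p, k=2)
adj_back, x_back = gunpool(adj, x_small, idx)
```

In a full graph U-Net, a decoder block would combine `x_back` with the features of the matching encoder block through the skip connection before applying its GCN layer; that step is omitted here.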