Learning to Control the Smoothness of Graph Convolutional Network Features

Shih-Hsin Wang, Justin Baker, Cory Hauck, Bao Wang

TL;DR

This paper proposes a strategy that lets a GCN learn node features with a desired smoothness -- adapting to data and tasks -- to enhance node classification, and shows that the resulting augmented message-passing schemes significantly improve node classification for GCN and some related models.

Abstract

The pioneering work of Oono and Suzuki [ICLR, 2020] and Cai and Wang [arXiv:2006.13318] initiated the analysis of the smoothness of graph convolutional network (GCN) features. Their results reveal an intricate empirical correlation between node classification accuracy and the ratio of smooth to non-smooth feature components. However, the optimal ratio that favors node classification is unknown, and the non-smooth feature components of deep GCNs with ReLU or leaky ReLU activation functions diminish. In this paper, we propose a new strategy that lets a GCN learn node features with a desired smoothness -- adapting to data and tasks -- to enhance node classification. Our approach has three key steps: (1) We establish a geometric relationship between the input and output of ReLU or leaky ReLU. (2) Building on our geometric insights, we augment the message-passing process of graph convolutional layers (GCLs) with a learnable term that modulates the smoothness of node features in a computationally efficient way. (3) We investigate the achievable ratio between smooth and non-smooth feature components for GCNs with the augmented message-passing scheme. Our extensive numerical results show that the augmented message-passing schemes significantly improve node classification for GCN and some related models.

Paper Structure

This paper contains 37 sections, 13 theorems, 77 equations, 4 figures, 11 tables.

Key Result

Proposition 2.1

All eigenvalues of the matrix ${\bm{G}}$ lie in the interval $(-1, 1]$. Furthermore, the nonnegative vectors $\{\Tilde{{\bm{D}}}^{\frac{1}{2}}{\bm{u}}_i/\|\Tilde{{\bm{D}}}^{\frac{1}{2}}{\bm{u}}_i\|\}_{1\leq i\leq m}$ form an orthonormal basis of ${\mathcal{M}}$.
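Proposition 2.1 can be checked numerically on a small graph. The sketch below is illustrative only and rests on assumptions not stated in this summary: that ${\bm{G}}$ is the self-loop-augmented normalized adjacency $\Tilde{{\bm{D}}}^{-\frac{1}{2}}({\bm{A}}+{\bm{I}})\Tilde{{\bm{D}}}^{-\frac{1}{2}}$, that ${\bm{u}}_i$ is the indicator vector of the $i$-th connected component, and that ${\mathcal{M}}$ is the eigenspace of ${\bm{G}}$ for eigenvalue 1. The helper name `normalized_adjacency` and the toy graph are hypothetical.

```python
import numpy as np

def normalized_adjacency(A):
    """Return (G, d_tilde), assuming G = D~^{-1/2} (A + I) D~^{-1/2}."""
    A_tilde = A + np.eye(A.shape[0])      # add self-loops
    d_tilde = A_tilde.sum(axis=1)         # degrees of the augmented graph
    d_inv_sqrt = 1.0 / np.sqrt(d_tilde)
    G = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
    return G, d_tilde

# Toy graph with two connected components: a path 0-1-2 and an edge 3-4.
A = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

G, d_tilde = normalized_adjacency(A)
eigvals = np.linalg.eigvalsh(G)           # G is symmetric

# Indicator vectors u_i of the components (hard-coded for this toy graph).
U = np.array([[1, 1, 1, 0, 0],
              [0, 0, 0, 1, 1]], dtype=float).T
E = np.sqrt(d_tilde)[:, None] * U         # columns are D~^{1/2} u_i
E = E / np.linalg.norm(E, axis=0)         # normalize each column

in_interval = bool(np.all(eigvals > -1.0) and np.all(eigvals <= 1.0 + 1e-12))
orthonormal = bool(np.allclose(E.T @ E, np.eye(2)))
fixed_by_G = bool(np.allclose(G @ E, E))  # each column is an eigenvector for eigenvalue 1
multiplicity_of_one = int(np.sum(np.isclose(eigvals, 1.0)))

print(in_interval, orthonormal, fixed_by_G, multiplicity_of_one)
```

Under these assumptions, the multiplicity of eigenvalue 1 equals the number of connected components, so the normalized columns of `E` span the whole eigenspace rather than a subspace of it.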

Figures (4)

  • Figure 1: Contrasting the effects of varying parameter $\alpha$ on the smoothness and normalized smoothness of output node features $\sigma({\bm{z}}_\alpha)$ and $\sigma_a({\bm{z}}_\alpha)$. Notice that the discontinuity of $s(\sigma({\bm{z}}_\alpha))$ in panel b) comes from the definition of normalized smoothness. Moreover, note that $s({\bm{z}})=1$ if ${\bm{z}}=\bf 0$, and $\sigma({\bm{z}}_\alpha)$ can become $\bf 0$ when $\alpha$ is large enough.
  • Figure 2: Node feature trajectories, colored by magnitude, for varying smoothness control parameter $\alpha$. For classical GCN (panel b), the node features converge to the eigenspace ${\mathcal{M}}$ (red dashed line).
  • Figure 3: The normalized smoothness -- of each dimension of the feature vectors at a given layer -- for a) GCN and b) GCN-SCT on the Citeseer dataset with 32 layers and 16 hidden dimensions. GCN features become entirely smooth from layer 14 onward, while GCN-SCT controls the smoothness of each feature at any depth. Horizontal and vertical axes represent the index of the feature dimension and the intermediate layer, respectively.
  • Figure 4: Gradient norms $||\partial {\bm{H}}^\text{out}/\partial {\bm{H}}^l||$ during training for layers $l\in[0,32]$ over 100 training epochs on the Citeseer dataset. Here, all models have 32 layers and 16 hidden dimensions for each layer. We observe that (a) GCN suffers from vanishing gradients. By contrast, (c) GCNII and (e) EGNN do not suffer from vanishing gradients, and their skip connection to ${\bm{H}}^0$ is visible. Because these models (GCNII/GCNII-SCT and EGNN/EGNN-SCT) connect ${\bm{H}}^0$ to every layer, the gradient at the first layer is nonzero. While SCT does not overcome vanishing gradients for (b) GCN-SCT, it increases the norm of the gradients for the intermediate layers in (d) GCNII-SCT and (f) EGNN-SCT.
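
The normalized smoothness referred to in Figures 1 and 3 can be illustrated with a small sketch. The definitions used here are assumptions, since this summary does not state them: $s({\bm{z}})$ is taken to be the norm of the projection of ${\bm{z}}$ onto ${\mathcal{M}}$ divided by $\|{\bm{z}}\|$, with the convention $s(\bf 0)=1$ noted in the Figure 1 caption, and ${\bm{z}}_\alpha$ is taken to be ${\bm{z}}-\alpha{\bm{e}}$ for a nonnegative unit vector ${\bm{e}}$ spanning a one-dimensional ${\mathcal{M}}$. The function names and the vectors are hypothetical.

```python
import numpy as np

def normalized_smoothness(z, E):
    """Assumed definition: ||proj_M(z)|| / ||z||, with s(0) = 1 by convention."""
    norm_z = np.linalg.norm(z)
    if norm_z == 0.0:
        return 1.0
    z_M = E @ (E.T @ z)                   # E has orthonormal columns spanning M
    return float(np.linalg.norm(z_M) / norm_z)

def relu(z):
    return np.maximum(z, 0.0)

def leaky_relu(z, a=0.1):
    return np.where(z > 0, z, a * z)

# One-dimensional M spanned by a nonnegative unit vector e (hypothetical choice).
e = np.ones(4) / 2.0
E = e[:, None]
z = np.array([1.0, -2.0, 0.5, 3.0])

for alpha in [-10.0, 0.0, 2.0, 10.0]:
    z_alpha = z - alpha * e               # assumed form of z_alpha
    s_relu = normalized_smoothness(relu(z_alpha), E)
    s_leaky = normalized_smoothness(leaky_relu(z_alpha), E)
    print(f"alpha={alpha:6.1f}  s(relu)={s_relu:.3f}  s(leaky)={s_leaky:.3f}")
```

With $\alpha=10$ every entry of ${\bm{z}}_\alpha$ is negative, so ReLU returns $\bf 0$ and the convention gives a normalized smoothness of exactly 1, whereas leaky ReLU keeps a nonzero output. This matches the caption's remark that $\sigma({\bm{z}}_\alpha)$ can become $\bf 0$ when $\alpha$ is large enough.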

Theorems & Definitions (31)

  • Proposition 2.1: Oono and Suzuki [ICLR, 2020]
  • Definition 2.2: Oono and Suzuki [ICLR, 2020]
  • Definition 2.3: Cai and Wang [arXiv:2006.13318]
  • Proposition 3.1
  • Proposition 3.2: ReLU
  • Proposition 3.3: Leaky ReLU
  • Definition 4.1
  • Remark 4.2
  • Proposition 4.3: ReLU
  • Proposition 4.4: Leaky ReLU
  • ...and 21 more