DropEdge: Towards Deep Graph Convolutional Networks on Node Classification

Yu Rong, Wenbing Huang, Tingyang Xu, Junzhou Huang

TL;DR

This paper tackles over-fitting and over-smoothing in deep Graph Convolutional Networks (GCNs) for node classification. It introduces DropEdge, a training-time edge dropout mechanism that acts as both data augmentation and a regulator of message passing, with theory showing that it either slows convergence toward over-smoothing or reduces the resulting information loss. The method applies across backbones (GCN, ResGCN, JKNet, IncepGCN, GraphSAGE) and yields consistent improvements on Cora, Citeseer, PubMed, and Reddit, including for very deep architectures, with visualizations supporting the claim that it mitigates over-smoothing. Overall, DropEdge is a simple, scalable tool for training deeper GCNs, with practical performance gains on node classification benchmarks.

Abstract

Over-fitting and over-smoothing are two main obstacles to developing deep Graph Convolutional Networks (GCNs) for node classification. In particular, over-fitting weakens generalization on small datasets, while over-smoothing impedes model training by isolating output representations from the input features as network depth increases. This paper proposes DropEdge, a novel and flexible technique to alleviate both issues. At its core, DropEdge randomly removes a certain number of edges from the input graph at each training epoch, acting as both a data augmenter and a message-passing reducer. Furthermore, we theoretically demonstrate that DropEdge either reduces the convergence speed of over-smoothing or relieves the information loss caused by it. More importantly, DropEdge is a general technique that can be combined with many other backbone models (e.g., GCN, ResGCN, GraphSAGE, and JKNet) for enhanced performance. Extensive experiments on several benchmarks verify that DropEdge consistently improves the performance of a variety of both shallow and deep GCNs. The effect of DropEdge on preventing over-smoothing is also visualized and validated empirically. Code is released at https://github.com/DropEdge/DropEdge.
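
The core step described in the abstract, randomly removing a certain number of edges at each training epoch, can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' released implementation: the function names `drop_edge` and `to_adjacency`, the `drop_rate` parameter, the edge-list representation, and the toy graph are all hypothetical choices made for this example.

```python
import numpy as np


def drop_edge(edges, drop_rate, rng):
    """Return a random subset of `edges` with a fraction `drop_rate` removed.

    edges: integer array of shape (E, 2), one row per undirected edge.
    (Hypothetical helper; the edge-list layout is an assumption.)
    """
    num_edges = edges.shape[0]
    num_keep = num_edges - int(np.floor(num_edges * drop_rate))
    # Sample the surviving edges uniformly without replacement.
    keep_idx = rng.choice(num_edges, size=num_keep, replace=False)
    return edges[np.sort(keep_idx)]


def to_adjacency(edges, num_nodes):
    """Build a dense symmetric adjacency matrix from an undirected edge list."""
    adj = np.zeros((num_nodes, num_nodes), dtype=np.float32)
    adj[edges[:, 0], edges[:, 1]] = 1.0
    adj[edges[:, 1], edges[:, 0]] = 1.0
    return adj


rng = np.random.default_rng(0)
# Toy graph with 5 nodes and 5 undirected edges (illustrative only).
edges = np.array([[0, 1], [0, 2], [1, 2], [2, 3], [3, 4]])

for epoch in range(3):
    # A fresh random subgraph is drawn at every training epoch.
    sampled = drop_edge(edges, drop_rate=0.4, rng=rng)
    adj = to_adjacency(sampled, num_nodes=5)
    # A GCN training step would consume `adj` here.
```

Because a different subgraph is drawn each epoch, the model sees a perturbed copy of the graph every time (the data-augmentation view), and each node aggregates over fewer neighbours (the message-passing-reduction view).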

Paper Structure

This paper contains 38 sections, 3 theorems, 11 equations, 7 figures, 7 tables.

Key Result

Theorem 1

We denote the original graph as ${\mathcal{G}}$ and the one after dropping certain edges out as ${\mathcal{G}}'$. Given a small value of $\epsilon$, we assume ${\mathcal{G}}$ and ${\mathcal{G}}'$ will encounter the $\epsilon$-smoothing issue with regard to subspaces ${\mathcal{M}}$ and ${\mathcal{M}

Figures (7)

  • Figure 1: Performance of multi-layer GCNs on Cora. We implement a 4-layer GCN with and without DropEdge (in orange) and an 8-layer GCN with and without DropEdge (in blue). GCN-4 over-fits, attaining low training error but high validation error; the training of GCN-8 fails to converge satisfactorily due to over-smoothing. With DropEdge applied, both GCN-4 and GCN-8 work well in both training and validation.
  • Figure 2:
  • Figure 3: Analysis on over-smoothing. Smaller distance means more serious over-smoothing.
  • Figure 4:
  • Figure 5: The illustration of four backbones. GCL indicates graph convolutional layer.
  • ...and 2 more figures

Theorems & Definitions (9)

  • Definition 1: subspace
  • Definition 2: $\epsilon$-smoothing
  • Definition 3: the $\epsilon$-smoothing layer
  • Definition 4: the relaxed $\epsilon$-smoothing layer
  • Theorem 1
  • Corollary 1
  • Lemma 2
  • proof
  • proof