Table of Contents
Fetching ...

Towards Scalable and Deep Graph Neural Networks via Noise Masking

Yuxuan Liang, Wentao Zhang, Zeang Sheng, Ling Yang, Quanqing Xu, Jiawei Jiang, Yunhai Tong, Bin Cui

TL;DR

This work presents random walk with noise masking (RMask), a plug-and-play module compatible with the existing model-simplification works that enables the exploration of deeper GNNs while preserving their scalability.

Abstract

In recent years, Graph Neural Networks (GNNs) have achieved remarkable success in many graph mining tasks. However, scaling them to large graphs is challenging due to the high computational and storage costs of repeated feature propagation and non-linear transformation during training. One commonly employed approach to address this challenge is model-simplification, which only executes the Propagation (P) once in the pre-processing, and Combine (C) these receptive fields in different ways and then feed them into a simple model for better performance. Despite their high predictive performance and scalability, these methods still face two limitations. First, existing approaches mainly focus on exploring different C methods from the model perspective, neglecting the crucial problem of performance degradation with increasing P depth from the data-centric perspective, known as the over-smoothing problem. Second, pre-processing overhead takes up most of the end-to-end processing time, especially for large-scale graphs. To address these limitations, we present random walk with noise masking (RMask), a plug-and-play module compatible with the existing model-simplification works. This module enables the exploration of deeper GNNs while preserving their scalability. Unlike the previous model-simplification works, we focus on continuous P and found that the noise existing inside each P is the cause of the over-smoothing issue, and use the efficient masking mechanism to eliminate them. Experimental results on six real-world datasets demonstrate that model-simplification works equipped with RMask yield superior performance compared to their original version and can make a good trade-off between accuracy and efficiency.

Towards Scalable and Deep Graph Neural Networks via Noise Masking

TL;DR

This work presents random walk with noise masking (RMask), a plug-and-play module compatible with the existing model-simplification works that enables the exploration of deeper GNNs while preserving their scalability.

Abstract

In recent years, Graph Neural Networks (GNNs) have achieved remarkable success in many graph mining tasks. However, scaling them to large graphs is challenging due to the high computational and storage costs of repeated feature propagation and non-linear transformation during training. One commonly employed approach to address this challenge is model-simplification, which only executes the Propagation (P) once in the pre-processing, and Combine (C) these receptive fields in different ways and then feed them into a simple model for better performance. Despite their high predictive performance and scalability, these methods still face two limitations. First, existing approaches mainly focus on exploring different C methods from the model perspective, neglecting the crucial problem of performance degradation with increasing P depth from the data-centric perspective, known as the over-smoothing problem. Second, pre-processing overhead takes up most of the end-to-end processing time, especially for large-scale graphs. To address these limitations, we present random walk with noise masking (RMask), a plug-and-play module compatible with the existing model-simplification works. This module enables the exploration of deeper GNNs while preserving their scalability. Unlike the previous model-simplification works, we focus on continuous P and found that the noise existing inside each P is the cause of the over-smoothing issue, and use the efficient masking mechanism to eliminate them. Experimental results on six real-world datasets demonstrate that model-simplification works equipped with RMask yield superior performance compared to their original version and can make a good trade-off between accuracy and efficiency.

Paper Structure

This paper contains 18 sections, 5 equations, 6 figures, 4 tables, 1 algorithm.

Figures (6)

  • Figure 1: Example of two limitations in model-simplification GNNs.
  • Figure 2: Observation and our insight.
  • Figure 3: The architecture of proposed RMask.
  • Figure 4: Test accuracy on ogbn-arxiv along the propagation steps with two model-simplification GNNs.
  • Figure 5: (a) Trade-off between efficiency and accuracy on ogbn-products. (b) Time cost breakdown. (c) Speedup analysis.
  • ...and 1 more figures