Scalable Message Passing Neural Networks: No Need for Attention in Large Graph Representation Learning

Haitz Sáez de Ocáriz Borde; Artem Lukoianov; Anastasis Kratsios; Michael Bronstein; Xiaowen Dong

TL;DR

This work proposes Scalable Message Passing Neural Networks (SMPNNs) and demonstrates that replacing attention with standard convolutional message passing inside a Pre-Layer Normalization Transformer-style block yields high-performing deep message-passing-based Graph Neural Networks (GNNs).

Abstract

We propose Scalable Message Passing Neural Networks (SMPNNs) and demonstrate that, by integrating standard convolutional message passing into a Pre-Layer Normalization Transformer-style block instead of attention, we can produce high-performing deep message-passing-based Graph Neural Networks (GNNs). This modification yields results competitive with the state of the art in large graph transductive learning, in particular outperforming the best Graph Transformers in the literature, without requiring the otherwise computationally and memory-expensive attention mechanism. Our architecture not only scales to large graphs but also makes it possible to construct deep message-passing networks, unlike simple GNNs, which have traditionally been constrained to shallow architectures due to oversmoothing. Moreover, we provide a new theoretical analysis of oversmoothing based on universal approximation, which we use to motivate SMPNNs. We show that, in the context of graph convolutions, residual connections are necessary for maintaining the universal approximation properties of downstream learners and that removing them can lead to a loss of universality.
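To make the idea in the abstract concrete, the following is a minimal NumPy sketch of a Pre-Layer Normalization Transformer-style block in which the attention sub-layer is replaced by a graph convolution, with a residual connection around each sub-layer. It is an illustration under stated assumptions, not the authors' implementation: the GCN-style symmetric normalization with self-loops, the ReLU activation, the single-matrix feed-forward step, the parameter-free layer normalization, and all function and parameter names (`normalized_adjacency`, `smpnn_block_sketch`, `w_conv`, `w_ff`) are hypothetical choices made for this example.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    # Normalize each node's feature vector to zero mean and unit variance
    # (no learnable scale or shift, to keep the sketch short).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def normalized_adjacency(adj):
    # Assumption: GCN-style symmetric normalization D^{-1/2} (A + I) D^{-1/2}.
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def smpnn_block_sketch(x, a_hat, w_conv, w_ff):
    # Sub-layer 1: pre-norm, then graph convolution where a Transformer
    # block would apply attention, then a residual connection.
    h = x + np.maximum(a_hat @ (layer_norm(x) @ w_conv), 0.0)
    # Sub-layer 2: pre-norm, pointwise feed-forward step, residual connection.
    return h + np.maximum(layer_norm(h) @ w_ff, 0.0)


# Usage on a 4-node path graph with 8 features per node.
rng = np.random.default_rng(0)
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
x = rng.normal(size=(4, 8))
a_hat = normalized_adjacency(adj)
out = smpnn_block_sketch(
    x, a_hat, rng.normal(size=(8, 8)), rng.normal(size=(8, 8))
)
```

Because both sub-layers are wrapped in residual connections, the input features always reach the output unchanged along the skip path; in this sketch, setting both weight matrices to zero makes the block the identity map. This is the structural property the abstract ties to preserving universality in deep stacks.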
