
Demystifying MPNNs: Message Passing as Merely Efficient Matrix Multiplication

Qin Jiang, Chengjia Wang, Michael Lones, Wei Pang

TL;DR

This work reframes MPNNs as memory-efficient implementations of matrix multiplication, showing that a $k$-layer GNN effectively aggregates from the $k$-hop neighborhood and is approximately equivalent to a single-layer model on $A^k$. It analyzes how different loop structures (self-loops, two-node loops, multi-node loops) alter $k$-hop connectivity and how this density impacts learning, challenging the notion that deeper GNNs fail solely due to over-smoothing. The authors reveal a structure–feature dichotomy: many datasets are, in effect, structure-only when features are uniform, with node degrees acting as the embedded feature; normalization schemes drastically influence information propagation and model behavior. They further argue that gradient-related issues, not just over-smoothing, largely explain performance degradation in sparse graphs, offering practical guidance on directed vs. undirected aggregation and normalization choices for robust GNN design and deployment.

Abstract

While Graph Neural Networks (GNNs) have achieved remarkable success, their design largely relies on empirical intuition rather than theoretical understanding. In this paper, we present a comprehensive analysis of GNN behavior through three fundamental aspects: (1) we establish that $k$-layer Message Passing Neural Networks efficiently aggregate $k$-hop neighborhood information through iterative computation, (2) analyze how different loop structures influence neighborhood computation, and (3) examine behavior across structure-feature hybrid and structure-only tasks. For deeper GNNs, we demonstrate that gradient-related issues, rather than just over-smoothing, can significantly impact performance in sparse graphs. We also analyze how different normalization schemes affect model performance and how GNNs make predictions with uniform node features, providing a theoretical framework that bridges the gap between empirical success and theoretical understanding.

Paper Structure

This paper contains 55 sections, 13 theorems, 28 equations, 8 figures, 3 tables.

Key Result

Lemma 2.3

For a graph $G=(V,E)$ with adjacency matrix $A$ and node feature matrix $X$, the features aggregated from the $k$-hop neighbors of each node are equivalent to the $k$th-order node feature $A^kX$.
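
The lemma can be checked numerically with a minimal sketch. It assumes an unnormalized, weight-free aggregation step $H \leftarrow AH$ with no nonlinearity, which is a simplification of a real GCN layer; the function names, the random graph, and the sizes `n`, `d`, `k` are illustrative choices rather than anything taken from the paper.

```python
import numpy as np

def k_hop_by_message_passing(A, X, k):
    """Apply k rounds of neighbor aggregation: A @ (A @ (... @ X))."""
    H = X
    for _ in range(k):
        H = A @ H  # one aggregation step; only an n-by-d matrix is kept in memory
    return H

def k_hop_by_matrix_power(A, X, k):
    """Form the dense n-by-n matrix A^k explicitly, then multiply by X."""
    return np.linalg.matrix_power(A, k) @ X

# Hypothetical toy data: a small random directed graph without self-loops.
rng = np.random.default_rng(0)
n, d, k = 6, 3, 4
A = (rng.random((n, n)) < 0.3).astype(float)
np.fill_diagonal(A, 0.0)
X = rng.standard_normal((n, d))

H_iter = k_hop_by_message_passing(A, X, k)
H_pow = k_hop_by_matrix_power(A, X, k)
same = np.allclose(H_iter, H_pow)  # both routes compute A^k X
```

The iterative route never materializes $A^k$, which is the sense in which message passing can be read as a memory-efficient way to carry out the same matrix product.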

Figures (8)

  • Figure 1: A $k$-layer GCN without added self-loops will only gather information from $k$-hop neighbors.
  • Figure 2: Types of loops in graphs: (a) self-loop, (b) loop with two nodes connected by an undirected edge, (c) and (d) are examples of n-node loops where n=3 and n=4 respectively.
  • Figure 3: Performance of Unidirectional GCN Without Self-loops on Chameleon and Squirrel Datasets: Model demonstrates stable accuracy up to 50 layers, with deeper architectures constrained by memory limitations. The solid line represents mean accuracy, while the shaded region indicates standard deviation across 10 data splits.
  • Figure 4: Comparison of different GCN architectures on three datasets: $k$-layer GCN (blue), $1$-layer GCN with $k$-hop neighbors (red), and $k$-hop neighbors with $1$-layer GCN and $(k-1)$ linear layers (green). The black line shows the density of the $k$-hop adjacency matrix.
  • Figure 5: Comparison of different GCN architectures on the CiteSeer dataset under different adjacency matrix formulations. (Top) Using the transposed adjacency matrix $A^T$, which propagates information from cited papers to citing papers. (Bottom) Using the undirected graph adjacency matrix $A + A^T$, which enables bidirectional information flow. In each subplot: $k$-layer GCN (blue), $1$-layer GCN with $k$-hop neighbors (red), and $k$-hop neighbors with $(k-1)$ linear layers (green). The black line indicates the density of the $k$-hop adjacency matrix.
  • ...and 3 more figures
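
The effect of loop structures described in the captions of Figures 1 and 2 can be illustrated on a toy graph. The sketch below is an assumption-laden example, not the paper's experiment: it uses a hypothetical 5-node directed chain, reads `A[i, j] = 1` as "node i aggregates from node j", and inspects the nonzero pattern of the $k$th matrix power to see whose features reach node 0 after $k = 2$ steps.

```python
import numpy as np

def reachable(A, k, node):
    """Indices whose features reach `node` after exactly k aggregation steps."""
    Ak = np.linalg.matrix_power(A, k)
    return sorted(np.flatnonzero(Ak[node]).tolist())

n = 5
A = np.zeros((n, n), dtype=int)
for i in range(n - 1):
    A[i, i + 1] = 1  # node i aggregates from node i+1 (directed chain)

A_self = A + np.eye(n, dtype=int)  # add a self-loop to every node
A_und = A + A.T                    # make every edge undirected (two-node loops)

no_loop = reachable(A, 2, 0)        # [2]: only the exact 2-hop neighbor
with_self = reachable(A_self, 2, 0) # [0, 1, 2]: everything within 2 hops
with_pair = reachable(A_und, 2, 0)  # [0, 2]: the two-node loop returns to node 0
```

In this toy case, self-loops turn "exactly $k$ hops" into "within $k$ hops", while undirected edges let a node see itself again at even depths; both make the $k$-hop adjacency matrix denser.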

Theorems & Definitions (25)

  • Definition 2.1
  • Definition 2.2
  • Lemma 2.3
  • Remark 2.4
  • Lemma 2.5
  • Remark 2.6
  • Lemma 2.7
  • Lemma 2.8
  • Lemma 2.9
  • Lemma 3.1
  • ...and 15 more