Rank Collapse Causes Over-Smoothing and Over-Correlation in Graph Neural Networks

Andreas Roth; Thomas Liebig

TL;DR

Rank collapse of node representations is identified as the root cause of over-smoothing and over-correlation in graph neural networks. The authors show theoretically that rank collapse occurs for every aggregation function, regardless of the feature transformations, and propose the sum of Kronecker products (SKP) as a property that provably prevents it. Experiments on nine node-classification tasks show that SKP can fit the data at depths of up to $32$ layers, where traditional message-passing models fall short. The findings suggest shifting the focus of research toward preventing rank collapse and highlight the need for normalization-aware metrics to measure it.

Abstract

Our study reveals new theoretical insights into over-smoothing and feature over-correlation in graph neural networks. Specifically, we demonstrate that with increased depth, node representations become dominated by a low-dimensional subspace that depends on the aggregation function but not on the feature transformations. For all aggregation functions, the rank of the node representations collapses, resulting in over-smoothing for particular aggregation functions. Our study emphasizes the importance for future research to focus on rank collapse rather than over-smoothing. Guided by our theory, we propose a sum of Kronecker products as a beneficial property that provably prevents over-smoothing, over-correlation, and rank collapse. We empirically demonstrate the shortcomings of existing models in fitting target functions of node classification tasks.
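The collapse described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' experimental setup: it assumes a linear message-passing layer of the form X <- A X W with a symmetrically normalized random graph, rescales X to unit Frobenius norm after every layer so that the measurement is not affected by the overall scale, and uses a hypothetical helper `rank_one_distance` to report how far the representations are from a rank-one matrix.

```python
import numpy as np

def sym_norm_adj(n, rng, p=0.3):
    """Random undirected graph with self-loops, symmetrically normalized."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    a = (upper | upper.T).astype(float) + np.eye(n)
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return d_inv_sqrt[:, None] * a * d_inv_sqrt[None, :]

def rank_one_distance(x):
    """Relative Frobenius distance of x to its best rank-one approximation."""
    s = np.linalg.svd(x, compute_uv=False)
    return float(np.sqrt(np.sum(s[1:] ** 2)) / np.sqrt(np.sum(s ** 2)))

def run_linear_mp(a, x, weights):
    """Apply X <- A X W once per weight matrix, rescaling X to unit norm."""
    for w in weights:
        x = a @ x @ w
        x = x / np.linalg.norm(x)  # normalization removes the scale effect
    return x

rng = np.random.default_rng(0)
n, d, depth = 30, 8, 64  # illustrative sizes, chosen arbitrarily
a = sym_norm_adj(n, rng)
x0 = rng.standard_normal((n, d))
# A different random feature transformation for every layer (an assumption).
weights = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(depth)]

shallow = rank_one_distance(run_linear_mp(a, x0, weights[:1]))
deep = rank_one_distance(run_linear_mp(a, x0, weights))
print(f"1 layer: {shallow:.3f}, {depth} layers: {deep:.3e}")
```

In this sketch the distance to rank one shrinks with depth even though the representations are renormalized at every layer, which is one reason a normalization-aware measurement is useful when checking for rank collapse.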

Paper Structure (24 sections, 10 theorems, 38 equations, 1 figure, 1 table)


Conclusion
Mathematical details
Basic Operations
Kronecker Product
Dirichlet Energy
Frobenius Norm
Proof of Proposition 4.1
Proof of Lemma 5.1
Proof of Theorem 5.2
Proof of Proposition 5.3
Proof of Theorem 5.4
Proof of Proposition 5.5
Proof of Proposition 5.6
Proof of Proposition 5.7
Over-correlation
...and 9 more sections

Key Result

Proposition 1

(Node representations vanish.) Let $\tilde{\mathbf{A}}\in\mathbb{R}^{n\times n}$ be symmetric with maximum absolute eigenvalue $|\lambda_1^{\tilde{\mathbf{A}}}|=1$, $\mathbf{W}\in\mathbb{R}^{d\times d}$ be any matrix with maximum singular value $\sigma_1^{\mathbf{W}}$, and $\phi$ a component-wise no…
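The statement above is cut off in this copy, so its exact conditions and conclusion are not reproduced here. As a hedged illustration of the vanishing effect named in its title, the sketch below assumes that $\phi$ is ReLU and that the largest singular value of $\mathbf{W}$ is 0.9; both are assumptions made for this example only, as are the matrix sizes. Under these assumptions the standard norm inequalities give $\|\phi(\tilde{\mathbf{A}}\mathbf{X}\mathbf{W})\|_F \le \|\tilde{\mathbf{A}}\|_2\,\|\mathbf{X}\|_F\,\|\mathbf{W}\|_2 = 0.9\,\|\mathbf{X}\|_F$, so the Frobenius norm decays at least geometrically with depth.

```python
import numpy as np

def relu(x):
    """Component-wise ReLU; it never increases the magnitude of an entry."""
    return np.maximum(x, 0.0)

rng = np.random.default_rng(1)
n, d, depth = 20, 6, 50  # illustrative sizes, chosen arbitrarily
sigma = 0.9              # assumed largest singular value of W

# Symmetric matrix rescaled so that its largest absolute eigenvalue is 1.
m = rng.standard_normal((n, n))
s = (m + m.T) / 2
a = s / np.max(np.abs(np.linalg.eigvalsh(s)))

# Feature transformation rescaled to have spectral norm sigma.
w = rng.standard_normal((d, d))
w = sigma * w / np.linalg.norm(w, 2)

x = rng.standard_normal((n, d))
norms = [np.linalg.norm(x)]  # Frobenius norm of the representations per layer
for _ in range(depth):
    x = relu(a @ x @ w)
    norms.append(np.linalg.norm(x))

print(f"initial norm: {norms[0]:.3f}, after {depth} layers: {norms[-1]:.3e}")
```

Because the norm itself shrinks toward zero in this setting, any unnormalized smoothness measure will also shrink, whether or not the representations have actually lost rank.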

Figures (1)

Figure 5: Accuracies and loss dynamics for the synthetic task.

Theorems & Definitions (20)

Proposition
proof
Lemma
proof
Theorem
proof
Proposition
proof
Theorem
proof
...and 10 more
