On the Necessity of Learnable Sheaf Laplacians

Ferran Hernandez Caralt; Mar Gonzàlez i Català; Adrián Bazaga; Pietro Liò

TL;DR

The paper introduces the Rayleigh quotient as a normalized measure for comparing oversmoothing across models and shows that, in trained networks, the behavior predicted by the diffusion-based analysis of SNNs is not reflected empirically.

Abstract

Sheaf Neural Networks (SNNs) were introduced as an extension of Graph Convolutional Networks to address oversmoothing on heterophilous graphs by attaching a sheaf to the input graph and replacing the adjacency-based operator with a sheaf Laplacian defined by (learnable) restriction maps. Prior work motivates this design through theoretical properties of sheaf diffusion and the kernel of the sheaf Laplacian, suggesting that suitable non-identity restriction maps can avoid representations converging to constants across connected components. Since oversmoothing can also be mitigated through residual connections and normalization, we revisit a trivial sheaf construction to ask whether the additional complexity of learning restriction maps is necessary. We introduce an Identity Sheaf Network baseline, where all restriction maps are fixed to the identity, and use it to ablate the empirical improvements reported by sheaf-learning architectures. Across five popular heterophilic benchmarks, the identity baseline achieves comparable performance to a range of SNN variants. Finally, we introduce the Rayleigh quotient as a normalized measure for comparing oversmoothing across models and show that, in trained networks, the behavior predicted by the diffusion-based analysis of SNNs is not reflected empirically. In particular, Identity Sheaf Networks do not appear to suffer more significant oversmoothing than their SNN counterparts.
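
To make the quantities in the abstract concrete, the following is a minimal sketch of a Rayleigh quotient computed edge by edge for a sheaf Laplacian. It assumes the standard form $R(x) = \langle x, \Delta x \rangle / \langle x, x \rangle$ with the unnormalized sheaf Laplacian, for which $\langle x, \Delta_\mathcal{F} x \rangle = \sum_{(u,v) \in E} \lVert F_u x_u - F_v x_v \rVert^2$. The paper's exact normalization is not shown in this excerpt, and the function names, toy graph, and restriction maps below are hypothetical.

```python
# Sketch only: Rayleigh quotient R(x) = <x, L x> / <x, x> for a sheaf Laplacian,
# accumulated edge by edge. Assumes the unnormalized Laplacian; all names are
# illustrative and not taken from the paper's code.

def matvec(M, v):
    """Multiply a matrix (nested lists) by a vector (list)."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def scaled_identity(d, s=1.0):
    """d x d matrix s * I, used here as a restriction map."""
    return [[s if i == j else 0.0 for j in range(d)] for i in range(d)]

def rayleigh_quotient(X, edges, maps):
    """X[u] is the feature vector at node u; maps[(u, v)] = (F_u, F_v)."""
    energy = 0.0
    for (u, v) in edges:
        F_u, F_v = maps[(u, v)]
        diff = [a - b for a, b in zip(matvec(F_u, X[u]), matvec(F_v, X[v]))]
        energy += sum(c * c for c in diff)
    norm_sq = sum(c * c for x in X for c in x)
    return energy / norm_sq

edges = [(0, 1), (1, 2)]  # toy path graph on three nodes
d = 2                     # stalk dimension

# Identity sheaf: every restriction map is the identity.
identity_maps = {e: (scaled_identity(d), scaled_identity(d)) for e in edges}
# A hand-picked non-identity sheaf: one end of each edge flips sign.
twisted_maps = {e: (scaled_identity(d), scaled_identity(d, -1.0)) for e in edges}

X_const = [[1.0, 2.0]] * 3                        # constant across nodes
X_varied = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]  # varies across nodes

r_const_identity = rayleigh_quotient(X_const, edges, identity_maps)
r_varied_identity = rayleigh_quotient(X_varied, edges, identity_maps)
r_const_twisted = rayleigh_quotient(X_const, edges, twisted_maps)
print(r_const_identity, r_varied_identity, r_const_twisted)
```

Under the identity sheaf, the constant signal has quotient 0 (it lies in the kernel) while the varied signal gives 4/3. With the sign-flipping maps the same constant signal gives 8/3, which illustrates how non-identity restriction maps change which signals count as smooth.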

Paper Structure

This paper contains 9 sections, 2 theorems, 2 figures, 3 tables.

Introduction
Background
Identity Sheaf Networks
Heterophily analysis
Oversmoothing analysis
Conclusions and Future Work
Appendix: Results on Film Dataset
Appendix: Code
Appendix: Hyperparameter Configurations

Key Result

Proposition 2.2

The solution to the ODE $\frac{dX(t)}{dt} = -\Delta_G X(t)$ satisfies $\lim_{t\rightarrow \infty} X(t) \in \ker(\Delta_G) = \{x \mid x_u = x_v \ \forall (u,v) \in E \}$. That is, in the limit the solution is constant on each connected component.
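
The limiting behavior in Proposition 2.2 can be checked numerically. The sketch below assumes the combinatorial Laplacian $L = D - A$, which is positive semi-definite, so the decaying heat flow is $\frac{dX}{dt} = -L X$. It integrates that flow with explicit Euler steps on a toy graph with two connected components. The graph, step size, and number of steps are illustrative choices, not values from the paper.

```python
# Sketch only: explicit Euler integration of dX/dt = -L X on a toy graph.
# The graph, step size, and step count are illustrative assumptions.

def graph_laplacian(n, edges):
    """Combinatorial Laplacian L = D - A of an undirected graph (nested lists)."""
    L = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1.0
        L[v][v] += 1.0
        L[u][v] -= 1.0
        L[v][u] -= 1.0
    return L

def diffuse(L, x, step=0.1, n_steps=2000):
    """Iterate x <- x - step * L x, an Euler discretization of the heat flow."""
    n = len(x)
    for _ in range(n_steps):
        Lx = [sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] - step * Lx[i] for i in range(n)]
    return x

# Two connected components: the path 0-1-2 and the single edge 3-4.
edges = [(0, 1), (1, 2), (3, 4)]
L = graph_laplacian(5, edges)
x0 = [3.0, 0.0, 0.0, 10.0, 20.0]

x_inf = diffuse(L, x0)
print([round(v, 4) for v in x_inf])
```

Each component settles at the mean of its initial values (1 for nodes 0 to 2, 15 for nodes 3 and 4), so the limit is constant within a component but differs between components. The step size of 0.1 keeps the Euler iteration stable here because the largest eigenvalue of this Laplacian is 3.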

Figures (2)

Figure 1: A series of line plots, one for each combination of dataset and neural network, showing the fold-averaged $R_{\Delta_\mathcal{F}}$ (red) and $R_{\Delta_\mathcal{I}}$ (blue) at each layer. These results contradict the theoretical understanding proposed by Bodnar et al. (2022): according to Hypothesis \ref{hip:decrease}, the blue line should lie above the red line.
Figure 2: Rayleigh quotients at the first and only layer of a trained SNN and ISN, with $R_{\Delta_\mathcal{F}}$ in red and $R_{\Delta_\mathcal{I}}$ in blue.

Theorems & Definitions (11)

Definition 2.1
Proposition 2.2
Definition 2.3
Definition 2.4
Proposition 2.5
Definition 2.6
Definition 3.1
Definition 3.2
Definition 4.1
Definition 4.2
...and 1 more
