Joint Diffusion Processes as an Inductive Bias in Sheaf Neural Networks

Ferran Hernandez Caralt, Guillermo Bernárdez Gil, Iulia Duta, Pietro Liò, Eduard Alarcón Cot

TL;DR

This paper addresses the failure modes of GNNs under heterophily and oversmoothing by leveraging Sheaf Neural Networks (SNNs), which attach a cellular sheaf to the graph. It introduces two opinion-dynamics–inspired variants (Joint Diffusion and Rotation-Invariant SNNs) that impose a heterophily-friendly inductive bias and reduce parameter counts, along with a dual diffusion mechanism that evolves both node features and restriction maps. A novel synthetic ellipsoid-based benchmark and a controlled Watts–Strogatz–style edge-generation pipeline are proposed to evaluate heterophily handling and data efficiency, with extensive experiments showing competitive performance and insightful behavior under noise and data scarcity. The work highlights practical benefits for scenarios with limited data or feature dimensionality and opens avenues for federated and geometric extensions of SNNs.

Abstract

Sheaf Neural Networks (SNNs) naturally extend Graph Neural Networks (GNNs) by endowing a cellular sheaf over the graph, equipping nodes and edges with vector spaces and defining linear mappings between them. While the attached geometric structure has proven to be useful in analyzing heterophily and oversmoothing, so far the methods by which the sheaf is computed do not always guarantee a good performance in such settings. In this work, drawing inspiration from opinion dynamics concepts, we propose two novel sheaf learning approaches that (i) provide a more intuitive understanding of the involved structure maps, (ii) introduce a useful inductive bias for heterophily and oversmoothing, and (iii) infer the sheaf in a way that does not scale with the number of features, thus using fewer learnable parameters than existing methods. In our evaluation, we show the limitations of the real-world benchmarks used so far on SNNs, and design a new synthetic task -- leveraging the symmetries of n-dimensional ellipsoids -- that enables us to better assess the strengths and weaknesses of sheaf-based models. Our extensive experimentation on these novel datasets reveals valuable insights into the scenarios and contexts where SNNs in general -- and our proposed approaches in particular -- can be beneficial.
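
To make the phrase "equipping nodes and edges with vector spaces and defining linear mappings between them" concrete, the following is a minimal sketch of how a sheaf Laplacian can be assembled from restriction maps. It assumes the common convention $L_F = \delta^\top \delta$ with coboundary $(\delta x)_e = F_{u \to e} x_u - F_{v \to e} x_v$; the function name `sheaf_laplacian`, the dictionary layout of `restriction_maps`, and the toy path graph are illustrative choices, not taken from the paper.

```python
import numpy as np

def sheaf_laplacian(num_nodes, edges, restriction_maps, d):
    """Assemble L_F = delta^T delta for a sheaf with d-dimensional stalks.

    edges: list of (u, v) node pairs.
    restriction_maps: dict mapping (node, edge_index) -> (d, d) array, the
        linear map from the stalk at `node` to the stalk of that edge.
    (Hypothetical helper and data layout, for illustration only.)
    """
    delta = np.zeros((len(edges) * d, num_nodes * d))
    for e, (u, v) in enumerate(edges):
        rows = slice(e * d, (e + 1) * d)
        # Coboundary row block: F_{u->e} x_u - F_{v->e} x_v
        delta[rows, u * d:(u + 1) * d] = restriction_maps[(u, e)]
        delta[rows, v * d:(v + 1) * d] = -restriction_maps[(v, e)]
    return delta.T @ delta

# Path graph 0 - 1 - 2 with 1-dimensional stalks.
edges = [(0, 1), (1, 2)]

# Identity restriction maps: the sheaf Laplacian equals the graph Laplacian.
identity_maps = {(n, e): np.eye(1) for e, (u, v) in enumerate(edges) for n in (u, v)}
L_trivial = sheaf_laplacian(3, edges, identity_maps, d=1)

# Flip the sign of one restriction map on edge 0: the kernel now contains a
# signal whose values on nodes 0 and 1 have opposite signs.
signed_maps = dict(identity_maps)
signed_maps[(1, 0)] = -np.eye(1)
L_signed = sheaf_laplacian(3, edges, signed_maps, d=1)
```

With identity maps the result is the ordinary graph Laplacian, whose kernel holds only constant signals. With the sign flip, the signal $(-1, 1, 1)$ lies in the kernel instead, which is one simple way to see how non-trivial restriction maps can accommodate disagreeing neighbours.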

Paper Structure

This paper contains 27 sections, 6 theorems, 17 equations, 8 figures, 1 table.

Key Result

Proposition 2.1

Let $G=(V,E)$ be a graph with associated graph Laplacian $L$, and let $X(t)$ be a node signal satisfying the graph diffusion equation (eq:RegGraphDiff). Then $\lim_{t \rightarrow \infty} X(t)$ is constant on every connected component. The proof is given in the appendix (appendix:proof_oversmoothing).
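
The proposition can be checked numerically on a small example. The sketch below assumes that the diffusion equation referenced above is the standard heat equation $\dot{X}(t) = -L X(t)$ with the combinatorial Laplacian $L = D - A$; this form is not stated in this excerpt, and the toy graph, the initial signal, and the helper name `diffuse` are illustrative.

```python
import numpy as np

# Two connected components: a triangle {0, 1, 2} and a single edge {3, 4}.
edges = [(0, 1), (1, 2), (0, 2), (3, 4)]
n = 5
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
L = np.diag(A.sum(axis=1)) - A  # combinatorial Laplacian (assumed form)

def diffuse(L, X0, t):
    """Solve dX/dt = -L X in closed form, X(t) = exp(-tL) X0.

    Uses the eigendecomposition of the symmetric matrix L.
    (Hypothetical helper, for illustration only.)
    """
    eigvals, eigvecs = np.linalg.eigh(L)
    return eigvecs @ (np.exp(-t * eigvals)[:, None] * (eigvecs.T @ X0))

X0 = np.array([[1.0], [4.0], [7.0], [-2.0], [6.0]])
X_late = diffuse(L, X0, t=50.0)
print(X_late.ravel())
```

Under these assumptions, the signal on the triangle collapses to its mean, 4, and the signal on the isolated edge collapses to its mean, 2. Node features within a component become indistinguishable, which is the oversmoothing behaviour the proposition describes.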

Figures (8)

  • Figure 1: Visual representations of (a) a pair-wise interaction modelled by a graph, (b) a pair-wise interaction modelled by a sheaf over a graph, and (c) a higher-order interaction modelled by a sheaf over a hypergraph.
  • Figure 2: Top: a diagram representing Regular Sheaf Diffusion. Bottom: a representation of the Learning to Lie ODE diffusion.
  • Figure 3: A visualization of the solutions of $zx = ty$, with $t=1$ because the ones with $t=0$ are trivially $t=0$ and $y=0$. This is, in fact, an affine projection of a projective variety. It corresponds to the set of equilibrium points of the joint diffusion equations (eq:JointDiff) in the case of 1-dimensional stalks and a graph with only 2 adjacent nodes, so $z,t$ are the restriction maps and $x,y$ are the node features. Sheaf diffusion only evolves $x,y$, so it may miss nearby equilibrium points along the $z$ axis.
  • Figure 4: Plot of two classes, red and blue, generated by sampling the surface of a 2-dimensional ellipsoid. The plot shows that the classes are distinct but not linearly separable, and that the expected value of both is zero.
  • Figure 5: Accuracy of the JdSNN variants, with 95% confidence intervals, as the percentage of Gaussian feature noise in the data increases. Our methods are more robust to noise than regular SNNs, while GCNs and MLPs underfit the data.
  • ...and 3 more figures

Theorems & Definitions (16)

  • Proposition 2.1
  • Definition 2.2
  • Definition 2.3
  • Definition 2.4
  • Definition 2.5
  • Definition 2.6
  • Definition 2.7
  • Proposition 2.8
  • Remark 2.9
  • Remark 3.1
  • ...and 6 more