Spectral Graph Pruning Against Over-Squashing and Over-Smoothing

Adarsh Jamadandi, Celia Rubio-Madrigal, Rebekka Burkholz

TL;DR

The paper argues that deleting edges can address over-squashing and over-smoothing simultaneously, which connects spectral gap optimization to the seemingly unrelated objective of reducing computational cost by pruning graphs for lottery tickets.

Abstract

Message Passing Graph Neural Networks are known to suffer from two problems that are sometimes believed to be diametrically opposed: over-squashing and over-smoothing. The former results from topological bottlenecks that hamper the information flow from distant nodes and is mitigated by spectral gap maximization, primarily by means of edge additions. However, such additions often promote over-smoothing, which renders nodes of different classes less distinguishable. Inspired by the Braess phenomenon, we argue that deleting edges can address over-squashing and over-smoothing simultaneously. This insight explains how edge deletions can improve generalization, thus connecting spectral gap optimization to a seemingly disconnected objective of reducing computational resources by pruning graphs for lottery tickets. To this end, we propose a more effective spectral gap optimization framework to add or delete edges and demonstrate its effectiveness on large heterophilic datasets.
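The claim that an edge deletion can raise the spectral gap can be checked on a very small graph. The sketch below is an illustration only, not the paper's optimization framework: it assumes the spectral gap is the second-smallest eigenvalue of the symmetric normalized Laplacian $I - D^{-1/2} A D^{-1/2}$, and it uses an exhaustive search over single deletions as a stand-in for any proxy-based method. The helper names `adjacency`, `spectral_gap`, and `best_single_deletion` are hypothetical.

```python
import numpy as np


def adjacency(n, edges):
    """Dense symmetric adjacency matrix of an undirected graph on n nodes."""
    adj = np.zeros((n, n))
    for u, v in edges:
        adj[u, v] = adj[v, u] = 1.0
    return adj


def spectral_gap(adj):
    """Second-smallest eigenvalue of I - D^{-1/2} A D^{-1/2} (assumed definition)."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    if np.any(deg == 0):
        raise ValueError("graph has an isolated node")
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    return float(np.linalg.eigvalsh(lap)[1])


def best_single_deletion(n, edges):
    """Brute force: try every single-edge deletion and keep the largest gap."""
    best_edge, best_gap = None, spectral_gap(adjacency(n, edges))
    for edge in edges:
        remaining = [e for e in edges if e != edge]
        adj = adjacency(n, remaining)
        if np.any(adj.sum(axis=1) == 0):
            continue  # skip deletions that isolate a node
        gap = spectral_gap(adj)
        if gap > best_gap + 1e-12:
            best_edge, best_gap = edge, gap
    return best_edge, best_gap


# Toy graph: triangle 0-1-2 with a pendant node 3 attached to node 0.
edges = [(0, 1), (0, 2), (1, 2), (0, 3)]
base_gap = spectral_gap(adjacency(4, edges))
edge, new_gap = best_single_deletion(4, edges)
print(f"gap before: {base_gap:.4f}, delete {edge}, gap after: {new_gap:.4f}")
```

On this toy graph, removing the triangle edge opposite the hub leaves a star, whose gap under the assumed definition is larger than that of the original graph. Exhaustive search costs one eigendecomposition per candidate edge, so it is only practical for graphs of this size.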

Paper Structure

This paper contains 26 sections, 4 theorems, 8 equations, 8 figures, 27 tables, 4 algorithms.

Key Result

Lemma 3.1

(Eldan et al., 2017) Let $\mathcal{G} = (\mathcal{V},\mathcal{E})$ be a finite graph, with $f$ denoting the eigenvector and $\lambda_1(\mathcal{L}_{\mathcal{G}})$ the eigenvalue corresponding to the spectral gap. Let $\{u,v\} \notin \mathcal{E}$ be two vertices that are not connected by an edge. Deno… If $g\left(u,v, \mathcal{L}_{\mathcal{G}}\right) > 0$, then $\lambda_1(\mathcal{L}_{\mathcal{G}}) >$ …

Figures (8)

  • Figure 1: Braess' paradox. We derive a simple example where deleting an edge from $\mathcal{G}$ to obtain $\mathcal{G}^-$ yields a higher spectral gap. Alternatively, we add a single edge to the base graph to either increase (${\mathcal{G}^+}$) or to decrease ($\widetilde{\mathcal{G}^+}$) the spectral gap. The relationship between the four graphs is highlighted by arrows when an edge is added/deleted.
  • Figure 2: We plot the MSE against the order of smoothing for our four synthetic graphs, and for a real heterophilic dataset after applying different rewiring algorithms to it: FoSR and ProxyAdd for adding edges (200 edges), and our ProxyDelete for deleting edges (5 edges). We find that deleting edges helps reduce over-smoothing while still mitigating over-squashing via the spectral gap increase.
  • Figure 3: We instantiate a toy Erdős-Rényi (ER) graph with 30 nodes and 58 edges and compare FoSR, our proxy spectral-gap-based methods, and our edge methods based on Eldan's criterion.
  • Figure 4: Neighboring configurations on each of the four graphs from Figure 1.
  • Figure 5: Different configurations of labels/features for the example graphs of Figure 1, as well as their respective smoothing rate tests, akin to the synthetic-graph panel of Figure 2. One configuration is the original, included for direct comparison. Another rotates the labels and achieves more intra-class edges. A third achieves the same number of intra-class edges but separates nodes with the same labels. The last alternates between classes and is a worse configuration to learn.
  • ...and 3 more figures
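
The Figure 2 caption refers to a curve of MSE against the order of smoothing, but this excerpt does not give the exact protocol. The sketch below shows one plausible way such a curve could be produced, under stated assumptions: smoothing is repeated mean aggregation with self-loops, $(D+I)^{-1}(A+I)$, and the MSE is that of a least-squares linear fit from the smoothed features to the targets. The function names `mean_aggregation_matrix` and `smoothing_mse_curve` and the ring example are hypothetical and are not taken from the paper.

```python
import numpy as np


def mean_aggregation_matrix(adj):
    """Row-stochastic propagation matrix (D + I)^{-1} (A + I) (assumed operator)."""
    a_tilde = np.asarray(adj, dtype=float) + np.eye(len(adj))
    return a_tilde / a_tilde.sum(axis=1, keepdims=True)


def smoothing_mse_curve(adj, features, targets, max_order):
    """MSE of a linear least-squares fit after 0, 1, ..., max_order smoothing steps."""
    prop = mean_aggregation_matrix(adj)
    x = np.asarray(features, dtype=float)
    y = np.asarray(targets, dtype=float)
    curve = []
    for _ in range(max_order + 1):
        # Linear model with an intercept, fitted on the current smoothed features.
        design = np.column_stack([np.ones(len(x)), x])
        weights, *_ = np.linalg.lstsq(design, y, rcond=None)
        curve.append(float(np.mean((design @ weights - y) ** 2)))
        x = prop @ x  # one more round of mean aggregation
    return curve


# Hypothetical example: a 6-node ring with two contiguous classes.
n = 6
ring = np.zeros((n, n))
for i in range(n):
    ring[i, (i + 1) % n] = ring[(i + 1) % n, i] = 1.0
labels = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
features = labels.reshape(-1, 1)  # features equal the labels before smoothing

curve = smoothing_mse_curve(ring, features, labels, max_order=5)
for order, mse in enumerate(curve):
    print(f"order {order}: MSE = {mse:.4f}")
```

Running the same function on a graph before and after rewiring would give two curves that can be compared in the spirit of the figure, although the numbers depend entirely on the assumed propagation operator and regression setup.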

Theorems & Definitions (4)

  • Lemma 3.1
  • Proposition 3.2
  • Proposition 3.3
  • Proposition 3.4