Graph Neural Networks Are More Than Filters: Revisiting and Benchmarking from A Spectral Perspective

Yushun Dong, Patrick Soga, Yinhan He, Song Wang, Jundong Li

TL;DR

The paper questions the dominance of neighborhood aggregation in shaping GNNs' spectral behavior and introduces a comprehensive benchmark to measure how GNNs capture and manipulate information across Laplacian frequency components. Through an exploratory study and an evaluation protocol backed by theoretical analysis, the work demonstrates that non-linear and other modules enable GNNs to flexibly produce target frequency components even when the inputs lack them, challenging the low-pass filter narrative. The benchmark uses frequency-domain energy incentives, the Energy Distribution Field concept, and the Normalized AUAC metric to compare 14 GNNs across multiple real-world datasets, revealing consistent spectral patterns (e.g., V-shaped accuracy curves) and model-specific spectral strengths. The findings have practical implications for model selection and architecture design, suggesting that a holistic spectral analysis framework can guide improvements beyond traditional filter-centric views.

Abstract

Graph Neural Networks (GNNs) have achieved remarkable success in various graph-based learning tasks. While their performance is often attributed to the powerful neighborhood aggregation mechanism, recent studies suggest that other components such as non-linear layers may also significantly affect how GNNs process the input graph data in the spectral domain. Such evidence challenges the prevalent opinion that neighborhood aggregation mechanisms dominate the behavioral characteristics of GNNs in the spectral domain. To demystify such a conflict, this paper introduces a comprehensive benchmark to measure and evaluate GNNs' capability in capturing and leveraging the information encoded in different frequency components of the input graph data. Specifically, we first conduct an exploratory study demonstrating that GNNs can flexibly yield outputs with diverse frequency components even when certain frequencies are absent or filtered out from the input graph data. We then formulate a novel research problem of measuring and benchmarking the performance of GNNs from a spectral perspective. To take an initial step towards a comprehensive benchmark, we design an evaluation protocol supported by comprehensive theoretical analysis. Finally, we introduce a comprehensive benchmark on real-world datasets, revealing insights that challenge prevalent opinions from a spectral perspective. We believe that our findings will open new avenues for future advancements in this area. Our implementations can be found at: https://github.com/yushundong/Spectral-benchmark.
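
The claim that outputs can contain frequencies absent from the input can be illustrated with a toy sketch. This is not the paper's protocol: it assumes an 8-node path graph, the unnormalized Laplacian $L = D - A$, and a single ReLU standing in for a GNN's non-linear module. The input is a pure low-frequency signal (the second Laplacian eigenvector), and the sketch measures how much normalized energy lies outside the two lowest frequencies before and after the ReLU. All names in the code are illustrative.

```python
import numpy as np

n = 8  # toy graph size (assumption)

# Adjacency matrix of a path graph on n nodes.
adj = np.zeros((n, n))
idx = np.arange(n - 1)
adj[idx, idx + 1] = 1.0
adj[idx + 1, idx] = 1.0

# Unnormalized graph Laplacian L = D - A.
lap = np.diag(adj.sum(axis=1)) - adj

# Columns of U are Laplacian eigenvectors, ordered from low to high frequency.
eigvals, U = np.linalg.eigh(lap)


def spectral_energy(x):
    """Normalized energy of signal x on each Laplacian eigenvector."""
    coeffs = U.T @ x          # graph Fourier transform
    energy = coeffs ** 2
    return energy / energy.sum()


k = 2                         # the first k eigenvectors count as "low" here
x_in = U[:, 1].copy()         # input with energy on one low frequency only
y_out = np.maximum(x_in, 0.0)  # element-wise ReLU, no aggregation involved

high_in = spectral_energy(x_in)[k:].sum()
high_out = spectral_energy(y_out)[k:].sum()
print(f"high-frequency energy share before ReLU: {high_in:.3e}")
print(f"high-frequency energy share after ReLU:  {high_out:.3e}")
```

In this toy setting the ReLU alone moves a non-trivial share of energy into frequencies that were absent from the input, which is consistent with the abstract's point that spectral behavior is not determined by neighborhood aggregation alone.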

Paper Structure

This paper contains 22 sections, 6 theorems, 6 equations, 14 figures, 5 tables.

Key Result

Theorem 4.2

The energy distribution function $e(v)=\frac{(\mathbf{A}v)\odot (\mathbf{A}v)}{||\mathbf{A}v||^2}$, with $\mathbf{A}$ being orthonormal, is Lipschitz on the unit sphere.
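
A minimal numerical sketch of this function, assuming a random orthonormal $\mathbf{A}$ obtained from a QR factorization (an illustrative choice, not taken from the paper). It checks that $e(v)$ is a non-negative vector summing to 1, and probes the Lipschitz property empirically on random pairs of unit vectors. The reference constant 2 used below is an elementary bound, not necessarily the constant proved in the paper: since $\mathbf{A}$ preserves norms, $\|e(u)-e(w)\| \le \max_i |a_i + b_i| \cdot \|a-b\| \le 2\|u-w\|$ with $a=\mathbf{A}u$, $b=\mathbf{A}w$.

```python
import numpy as np


def energy_distribution(A, v):
    """e(v) = (Av) ⊙ (Av) / ||Av||^2 for a matrix A and a nonzero vector v."""
    z = A @ v
    return (z * z) / np.dot(z, z)


rng = np.random.default_rng(0)
n = 6  # illustrative dimension

# Random orthonormal matrix: the Q factor of a Gaussian matrix.
A, _ = np.linalg.qr(rng.standard_normal((n, n)))


def random_unit_vector():
    """Draw a point on the unit sphere in R^n."""
    x = rng.standard_normal(n)
    return x / np.linalg.norm(x)


v = random_unit_vector()
e_v = energy_distribution(A, v)

# Empirical probe of the Lipschitz ratio on the unit sphere.
max_ratio = 0.0
for _ in range(1000):
    u, w = random_unit_vector(), random_unit_vector()
    diff = energy_distribution(A, u) - energy_distribution(A, w)
    max_ratio = max(max_ratio, np.linalg.norm(diff) / np.linalg.norm(u - w))

print("entries of e(v) sum to:", e_v.sum())
print("largest observed ratio:", max_ratio)
```

Random sampling only gives a lower estimate of the true Lipschitz constant, so the probe is a sanity check rather than a substitute for the theorem.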

Figures (14)

  • Figure 1: A comparison between the energy of input and output frequency components of GCN on the Coauthor-CS dataset. The results show that the output frequency components can always flexibly align with the target distribution in both cases: (a) inputting low-frequency components only but aiming to output high-frequency components; and (b) inputting high-frequency components only but aiming to output low-frequency components.
  • Figure 2: The accuracy curves of different GNNs in the whole spectral domain. In each subplot, the $x$-axis represents the frequency and the $y$-axis represents the accuracy of GNNs in the node classification task with the ground truth labels derived from the associated frequency bin.
  • Figure 3: Performance comparison in the average ranking of 14 GNNs on six real-world datasets. The GNNs shown are those obtaining the best rankings on low-frequency components (left), on middle-frequency components (middle), and on high-frequency components (right).
  • Figure 4: Kendall's $\tau$ comparison on new datasets between a random ranking (orange), the rankings from the original graph learning task (blue), and the average rankings from our benchmark (green).
  • Figure 5: Accuracy curves in the spectral domain across different GNNs on the Coauthor-CS dataset. The shape of these curves does not significantly change across different layer numbers, validating the stability of the proposed evaluation protocol.
  • ...and 9 more figures

Theorems & Definitions (11)

  • Definition 4.1
  • Theorem 4.2
  • Theorem 4.3
  • Theorem 4.4
  • Definition B.1
  • Theorem B.2
  • proof
  • Theorem B.3
  • proof
  • Theorem B.4
  • ...and 1 more