Energy Guided smoothness to improve Robustness in Graph Classification

Farooq Ahmad Wani, Maria Sofia Bucarelli, Andrea Giuseppe Di Francesco, Oleksandr Pryymak, Fabrizio Silvestri

TL;DR

Graph classification with noisy labels challenges GNNs, but robustness can be tied to the smoothness of learned representations. The authors propose an energy-guided perspective centered on the Dirichlet energy $E^{dir}$, and introduce three smoothing-based strategies: (i) enforcing positive eigenvalues in weight matrices, (ii) directly regularizing Dirichlet energy, and (iii) the GCOD loss that discounts likely noisy samples via centroid-based signals. Across diverse benchmarks, robustness improvements accompany lower or stabilized $E^{dir}$, with GCOD delivering the strongest gains and minimal overhead while preserving performance on clean data. This work suggests that controlling spectral smoothness is a principled path to robust graph classification and potentially extends to domain shift and adversarial settings.

Abstract

Graph Neural Networks (GNNs) are powerful at solving graph classification tasks, yet applied problems often contain noisy labels. In this work, we study GNN robustness to label noise and demonstrate GNN failure modes: models struggle to generalise on low-order graphs, under low label coverage, or when over-parameterized. We establish both empirical and theoretical links between GNN robustness and the reduction of the total Dirichlet Energy of learned node representations, which encapsulates the hypothesized GNN smoothness inductive bias. Finally, we introduce two training strategies to enhance GNN robustness: (1) incorporating a novel inductive bias in the weight matrices through the removal of negative eigenvalues, which is connected to Dirichlet Energy minimization; (2) extending to GNNs a loss penalty that promotes learned smoothness. Importantly, neither approach negatively impacts performance in noise-free settings, supporting our hypothesis that the source of GNN robustness is their smoothness inductive bias.
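
To make the quantity at the centre of the abstract concrete, here is a minimal sketch of a Dirichlet energy computation. It assumes the common unnormalized form, the sum of squared feature differences over undirected edges; the paper may use a degree-normalized variant. The function name `dirichlet_energy` and the toy path graph are illustrative and not taken from the paper.

```python
def dirichlet_energy(features, edges):
    """Sum of squared feature differences across undirected edges.

    Assumption: unnormalized form; a degree-normalized Laplacian
    would rescale each term but keep the same interpretation.

    features: list of per-node feature vectors (lists of floats).
    edges: list of (i, j) node-index pairs, each undirected edge listed once.
    """
    total = 0.0
    for i, j in edges:
        total += sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
    return total


# A 4-node path graph: 0 - 1 - 2 - 3
edges = [(0, 1), (1, 2), (2, 3)]

smooth = [[1.0], [1.0], [1.0], [1.0]]    # neighbours agree everywhere
rough = [[1.0], [-1.0], [1.0], [-1.0]]   # neighbours disagree on every edge

smooth_energy = dirichlet_energy(smooth, edges)  # no variation across edges
rough_energy = dirichlet_energy(rough, edges)    # 3 edges, each contributing 2.0 ** 2
```

Low energy means neighbouring nodes carry similar representations, which is the sense of "smoothness" used throughout; the oscillating signal is the high-energy extreme on this graph.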

Paper Structure

This paper contains 43 sections, 1 theorem, 22 equations, 12 figures, 7 tables.

Key Result

Proposition 6.1

Let $\mathcal{D} = \{ \mathcal{G}^1 = (\mathbf{Z}^1, \mathbf{A}^1), \ldots, \mathcal{G}^{n} = (\mathbf{Z}^n, \mathbf{A}^n) \}$ be a set of graphs. Then $E^{dir}_{\text{set}}(\mathcal{D}) = \frac{1}{|\mathcal{D}|} E^{dir}(\mathbf{Z})$, where $\mathbf{Z} = [\mathbf{Z}^1 \| \ldots \| \mathbf{Z}^n]$ and ...
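
The statement above is cut off, so the following numerical check rests on two assumptions that are not confirmed by the visible text: that $E^{dir}_{\text{set}}$ is the mean of the per-graph Dirichlet energies, and that $E^{dir}(\mathbf{Z})$ is evaluated on the disjoint union of the graphs (block-diagonal adjacency). It also assumes the unnormalized edge-sum form of the energy. All function names and the two toy graphs are hypothetical.

```python
def dirichlet_energy(features, edges):
    """Unnormalized Dirichlet energy: squared differences summed over edges."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
        for i, j in edges
    )


def set_dirichlet_energy(graphs):
    """Mean of per-graph energies (assumed reading of E^{dir}_set)."""
    return sum(dirichlet_energy(z, e) for z, e in graphs) / len(graphs)


def disjoint_union(graphs):
    """Stack node features and shift edge indices so no edge crosses graphs."""
    features, edges, offset = [], [], 0
    for z, e in graphs:
        features.extend(z)
        edges.extend((i + offset, j + offset) for i, j in e)
        offset += len(z)
    return features, edges


graphs = [
    # Path on 3 nodes with 2-dimensional features.
    ([[0.0, 1.0], [2.0, 1.0], [2.0, 3.0]], [(0, 1), (1, 2)]),
    # Single edge between 2 nodes.
    ([[1.0, 1.0], [4.0, 5.0]], [(0, 1)]),
]

lhs = set_dirichlet_energy(graphs)
z_all, e_all = disjoint_union(graphs)
rhs = dirichlet_energy(z_all, e_all) / len(graphs)
```

Under these assumptions the two sides agree because no edge connects nodes from different graphs, so the energy of the stacked representation is just the sum of the per-graph energies.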

Figures (12)

  • Figure 1: Training accuracy on noisy labels only. Effect of dataset properties: (a) Fewer classes in PPA lead to faster overfitting on noise. (b) Lower graph order leads to faster overfitting on noise.
  • Figure 2: Training accuracy on noisy labels only. Effect of dataset size: (a) Smaller fractions of the PPA dataset lead to stronger noise memorization. (b) Smaller synthetic datasets are also more prone to memorizing noise.
  • Figure 3: (a) Evolution of training accuracy for the GIN model on the PPA dataset (30% sample, 6 classes) with clean labels (0% noise) or 20% label noise, for CE and GCOD. (b) Dirichlet energy for the clean and noise-injected PPA dataset (30% sample, 6 classes). The Dirichlet energy increases when the model trained with CE fits the noise.
  • Figure 4: (c) Dirichlet energy and test accuracy on the PPA dataset using CE with GIN and GCN models. (d) Dirichlet energy and test accuracy on different datasets using the GIN model (axes scaled for comparison).
  • Figure 5: Evolution of Dirichlet energy across training epochs for the representations $Y$, $Z_1$, and $Z_2$ learned by the FGRL model on the ENZYMES dataset. Solid lines represent training with 0% label noise, while dashed lines correspond to 30% symmetric label noise. The top-left plot shows the normalized energy trajectories for all three representations, with each normalized by its own maximum value to enable direct comparison. The remaining plots display the raw Dirichlet energy for each representation individually, preserving their respective scales to emphasize magnitude differences and noise sensitivity.
  • ...and 7 more figures

Theorems & Definitions (2)

  • Proposition 6.1
  • Remark 6.2