On the Relationship Between Robustness and Expressivity of Graph Neural Networks

Lorenz Kummer, Wilfried N. Gansterer, Nils M. Kriege

TL;DR

The paper studies how bit-flip attacks (BFAs) on the weights of Graph Neural Networks (GNNs) can erode expressivity, i.e., the ability to distinguish non-isomorphic graphs, by linking expressivity to the $1$-WL test and to the injectivity of neural multiset functions. It develops an analytical framework that yields upper bounds on the number of bit flips required to degrade node-level and graph-level expressivity; these bounds depend on architecture, bit width, graph homophily, and feature encoding. Special cases show that ReLU-activated GNNs are particularly vulnerable, while activations such as Sigmoid or SiLU improve resilience; first-layer attacks on models with one-hot features are also highly impactful. Experiments on ten real-world TU datasets support the theory and lead to practical mitigation guidelines: move away from one-hot features, densify inputs, and select more robust activations.
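To make the expressivity notion concrete, the following is a minimal sketch of $1$-WL color refinement on unlabeled graphs. The function names (`wl_histogram`, `wl_distinguishes`, `cycle`) and the adjacency-dict representation are assumptions made for this illustration and are not taken from the paper.

```python
from collections import Counter

def wl_histogram(adj, rounds=3):
    """Run 1-WL color refinement and return the final color histogram.

    adj maps each node to a list of its neighbors. All nodes start with
    the same color, i.e. the graph is treated as unlabeled.
    """
    colors = {v: 0 for v in adj}
    for _ in range(rounds):
        # New color = own color plus the sorted multiset of neighbor colors.
        colors = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
    return Counter(colors.values())

def wl_distinguishes(g1, g2, rounds=3):
    """True if 1-WL assigns the two graphs different color histograms."""
    return wl_histogram(g1, rounds) != wl_histogram(g2, rounds)

def cycle(n, offset=0):
    """Hypothetical helper: an n-cycle on nodes offset..offset+n-1."""
    return {offset + i: [offset + (i - 1) % n, offset + (i + 1) % n]
            for i in range(n)}

two_triangles = {**cycle(3), **cycle(3, offset=3)}
hexagon = cycle(6)
path = {0: [1], 1: [0, 2], 2: [1]}

print(wl_distinguishes(two_triangles, hexagon))  # both graphs are 2-regular
print(wl_distinguishes(path, cycle(3)))          # node degrees differ
```

Two disjoint triangles and a hexagon are non-isomorphic, yet every node in both graphs sees the same neighborhood colors, so $1$-WL cannot separate them. A GNN whose expressivity is bounded by $1$-WL inherits this limit, and the paper asks how far below it a bit-flip attack can push a model.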

Abstract

We investigate the vulnerability of Graph Neural Networks (GNNs) to bit-flip attacks (BFAs) by introducing an analytical framework to study the influence of architectural features, graph properties, and their interaction. The expressivity of GNNs refers to their ability to distinguish non-isomorphic graphs and depends on the encoding of node neighborhoods. We examine the vulnerability of neural multiset functions commonly used for this purpose and establish formal criteria to characterize a GNN's susceptibility to losing expressivity due to BFAs. This enables an analysis of the impact of homophily, graph structural variety, feature encoding, and activation functions on GNN robustness. We derive theoretical bounds for the number of bit flips required to degrade GNN expressivity on a dataset, identifying ReLU-activated GNNs operating on highly homophilous graphs with low-dimensional or one-hot encoded features as particularly susceptible. Empirical results using ten real-world datasets confirm the statistical significance of our key theoretical insights and offer actionable results to mitigate BFA risks in expressivity-critical applications.
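As a toy illustration of why the activation function matters, the sketch below encodes a multiset of one-hot node features with a single sum-aggregate, linear, activation unit. The setup is an assumption chosen for brevity: the weights, the helper `encode`, and the "flip every sign bit" scenario are hypothetical and are not the paper's construction or its bounds.

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def encode(multiset, weights, act):
    """Sum-aggregate one-hot features, then apply one linear unit and act."""
    counts = [sum(x[i] for x in multiset) for i in range(len(weights))]
    return act(sum(w * c for w, c in zip(weights, counts)))

# Two different neighborhoods, given as multisets of one-hot features.
a = [(1, 0), (1, 0)]
b = [(0, 1), (0, 1)]

clean = [0.5, 1.5]
attacked = [-w for w in clean]  # hypothetical: sign bit flipped on every weight

print(encode(a, clean, relu), encode(b, clean, relu))          # distinct
print(encode(a, attacked, relu), encode(b, attacked, relu))    # both 0.0
print(encode(a, attacked, sigmoid) != encode(b, attacked, sigmoid))
```

With non-negative one-hot inputs, negative weights push every pre-activation below zero, so ReLU maps both neighborhoods to the same value and the unit is no longer injective on these multisets. Sigmoid is strictly monotone and keeps the two outputs apart in this example.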

Paper Structure

This paper contains 34 sections, 4 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: Visualization of $\mu \pm \sigma$ of the metric $M^{\Pi_1^k}_{GNN, {D}}$ (bottom) computed from the 20250 unperturbed (clean) models used in the attack runs for ReLU, Sigmoid, SiLU (top), aggregated across 3-layer GIN/GCN and DS.
  • Figure 2: Significance levels of Spearman correlation of dataset properties $S_{WL, {D}}^{(k)}$, $H_{{D}}$ and feature dimensionality with $\Delta Exp$ computed across all runs and datasets for exponent, sign and mantissa target bits in Sigmoid, ReLU and SiLU activated 3-layer GIN/GCNs and DS (101250 runs in total).
  • Figure 3: Flips in exponent, sign and mantissa bits of Sigmoid, ReLU and SiLU activated 3-layer GIN/GCNs and DS. Each group of $10$ bars represents expressivity (i.e. the fraction of distinguishable non-isomorphic graphs of the given dataset) after (left to right) 1% to 95% bits flipped in a certain component of the FLOAT32 representation of the weights in a layer. Each bar represents $\mu \pm \sigma$ computed from 25 runs (101250 runs in total).
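The captions above distinguish flips in the sign, exponent, and mantissa bits of FLOAT32 weights. The sketch below shows what a single flip in each component does to one value, assuming the standard IEEE 754 binary32 layout (bit 31 sign, bits 30 to 23 exponent, bits 22 to 0 mantissa). The helper `flip_bit` is a hypothetical name for this illustration and is not the attack procedure used in the paper.

```python
import struct

def flip_bit(value: float, bit: int) -> float:
    """Flip one bit (0 = least significant) of the float32 encoding of value."""
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    # XOR toggles the selected bit; then reinterpret back as float32.
    (flipped,) = struct.unpack("<f", struct.pack("<I", bits ^ (1 << bit)))
    return flipped

w = 0.5
print(flip_bit(w, 31))  # sign bit: the weight changes sign
print(flip_bit(w, 30))  # highest exponent bit: the magnitude explodes
print(flip_bit(w, 0))   # lowest mantissa bit: a tiny perturbation
```

For `w = 0.5`, the sign flip gives `-0.5`, the exponent flip gives `2**127`, and the mantissa flip gives `0.5 + 2**-24`, which illustrates why the three bit groups are reported separately.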