On the Relationship Between Robustness and Expressivity of Graph Neural Networks
Lorenz Kummer, Wilfried N. Gansterer, Nils M. Kriege
TL;DR
The paper studies how bit-flip attacks (BFAs) on the weights of graph neural networks (GNNs) can erode expressivity, i.e., the ability to distinguish non-isomorphic graphs, by linking expressivity to the $1$-WL test and to the injectivity of neural multiset functions. It develops an analytical framework that yields upper bounds on the number of bit flips required to degrade node-level and graph-level expressivity, and shows how these bounds depend on architecture, bit width, graph homophily, and feature encoding. Special cases reveal that ReLU-activated GNNs are particularly vulnerable, whereas activations such as Sigmoid or SiLU improve resilience; attacks on the first layer are also highly impactful when features are one-hot encoded. Empirical validation on ten real-world TU datasets supports the key theoretical insights and yields practical mitigation guidelines: moving away from one-hot features, densifying inputs, and selecting more robust activations.
Abstract
We investigate the vulnerability of Graph Neural Networks (GNNs) to bit-flip attacks (BFAs) by introducing an analytical framework to study the influence of architectural features, graph properties, and their interaction. The expressivity of GNNs refers to their ability to distinguish non-isomorphic graphs and depends on the encoding of node neighborhoods. We examine the vulnerability of neural multiset functions commonly used for this purpose and establish formal criteria to characterize a GNN's susceptibility to losing expressivity due to BFAs. This enables an analysis of the impact of homophily, graph structural variety, feature encoding, and activation functions on GNN robustness. We derive theoretical bounds for the number of bit flips required to degrade GNN expressivity on a dataset, identifying ReLU-activated GNNs operating on highly homophilous graphs with low-dimensional or one-hot encoded features as particularly susceptible. Empirical results using ten real-world datasets confirm the statistical significance of our key theoretical insights and offer actionable results to mitigate BFA risks in expressivity-critical applications.
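To make the mechanism concrete, the following is a minimal sketch rather than the paper's actual framework. It assumes scalar, positive node features, a single signed 8-bit weight, and a sum-aggregated ReLU neighborhood encoder; the helper names `flip_bit` and `sum_aggregate` are hypothetical and chosen only for illustration. It shows how one flip of the sign bit can make two distinct neighborhood multisets indistinguishable.

```python
def flip_bit(value: int, bit: int, width: int = 8) -> int:
    """Flip one bit of a signed two's-complement integer of the given width."""
    mask = (1 << width) - 1
    raw = (value & mask) ^ (1 << bit)  # flip in the unsigned representation
    # Reinterpret the result as a signed integer.
    return raw - (1 << width) if raw >= (1 << (width - 1)) else raw


def relu(x: int) -> int:
    return max(0, x)


def sum_aggregate(weight: int, features: list[int]) -> int:
    """Toy multiset encoder (assumption): sum of ReLU(weight * h) over neighbors."""
    return sum(relu(weight * h) for h in features)


# Two different neighborhood multisets with positive scalar features.
neigh_a = [1, 2]
neigh_b = [1, 1, 3]

w = 3  # intact int8 weight
before = (sum_aggregate(w, neigh_a), sum_aggregate(w, neigh_b))  # (9, 15)

w_attacked = flip_bit(w, 7)  # flipping the sign bit turns 3 into -125
after = (sum_aggregate(w_attacked, neigh_a),
         sum_aggregate(w_attacked, neigh_b))  # (0, 0)

print("distinguishable before attack:", before[0] != before[1])
print("distinguishable after attack: ", after[0] != after[1])
```

In this toy setting the intact weight maps the two multisets to different values, whereas after the flip every pre-activation is negative and ReLU sends both neighborhoods to zero, so the encoder is no longer injective on these inputs. Real GNN layers use weight matrices and vector features, so more than one flip may be needed there.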
