Superposition in Graph Neural Networks
Lukas Pertl, Han Xuanyuan, Pietro Liò
TL;DR
This work presents a representation-centric framework for studying superposition in graph neural networks. It quantifies how many independent axes a representation uses and how tightly feature directions are packed in node and graph latent spaces. Features are extracted as linear-probe directions and class-conditional centroids on held-out data and analyzed with metrics such as effective rank (EffRank), SI, and Welch-normalized overlap, which exposes how width and pooling shape representational geometry across GCN, GIN, and GAT. The findings include a three-phase dependence of overlap on width, topology-induced entanglement at the node level that pooling can re-mix into task axes, and metastable low-rank embeddings in shallow models. Practically, the results offer design guidance for more interpretable GNNs by linking architectural choices to concrete geometric outcomes.
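To make two of the metrics named above concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common entropy-based definition of effective rank (the exponential of the Shannon entropy of the normalized singular values) and assumes that "Welch-normalized overlap" means the largest pairwise absolute cosine divided by the Welch lower bound for n unit vectors in d dimensions. The function names are hypothetical, and SI is omitted because its definition is not given in this section.

```python
import numpy as np

def effective_rank(X):
    """Entropy-based effective rank of a matrix (assumed definition).

    Computes exp(H(p)), where p are the singular values normalized to sum to 1.
    No centering is applied here; whether to center is a modeling choice.
    """
    s = np.linalg.svd(np.asarray(X, dtype=float), compute_uv=False)
    total = s.sum()
    if total == 0.0:
        return 0.0
    p = s / total
    p = p[p > 0]  # drop exact zeros so log is defined
    return float(np.exp(-(p * np.log(p)).sum()))

def welch_bound(n, d):
    """Welch lower bound on the max |cosine| among n unit vectors in R^d.

    The bound is 0 when n <= d, since the vectors can then be orthogonal.
    """
    if n <= d:
        return 0.0
    return float(np.sqrt((n - d) / (d * (n - 1))))

def max_abs_cosine(F):
    """Largest off-diagonal |cosine| among the rows of F (one feature per row)."""
    F = np.asarray(F, dtype=float)
    norms = np.maximum(np.linalg.norm(F, axis=1, keepdims=True), 1e-12)
    U = F / norms
    G = np.abs(U @ U.T)
    np.fill_diagonal(G, 0.0)  # ignore self-similarity
    return float(G.max())

def welch_normalized_overlap(F):
    """Ratio of observed max overlap to the Welch bound (assumed normalization).

    Returns NaN when n <= d, where the bound is 0 and the ratio is undefined.
    """
    F = np.asarray(F, dtype=float)
    n, d = F.shape
    bound = welch_bound(n, d)
    return max_abs_cosine(F) / bound if bound > 0 else float("nan")
```

Under these assumptions, three unit vectors spaced 120 degrees apart in the plane have pairwise |cosine| of 0.5, which equals the Welch bound for n = 3, d = 2, so their normalized overlap is 1: they are packed as loosely as the geometry allows. Values above 1 indicate tighter packing than the optimum.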
Abstract
Interpreting graph neural networks (GNNs) is difficult because message passing mixes signals and internal channels rarely align with human concepts. We study superposition, the sharing of directions by multiple features, directly in the latent space of GNNs. Using controlled experiments with unambiguous graph concepts, we extract features as (i) class-conditional centroids at the graph level and (ii) linear-probe directions at the node level, and then analyze their geometry with simple basis-invariant diagnostics. Across GCN/GIN/GAT we find: increasing width produces a phase pattern in overlap; topology imprints overlap onto node-level features that pooling partially remixes into task-aligned graph axes; sharper pooling increases axis alignment and reduces channel sharing; and shallow models can settle into metastable low-rank embeddings. These results connect representational geometry with concrete design choices (width, pooling, and final-layer activations) and suggest practical approaches for more interpretable GNNs.
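The two feature-extraction steps described above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: `H` is assumed to be an array of embeddings (one row per graph or node) with integer labels `y`, the function names are hypothetical, and a ridge-regularized least-squares probe onto one-hot targets is swapped in as a simple stand-in for whatever linear probe the paper actually trains.

```python
import numpy as np

def class_centroids(H, y):
    """Class-conditional centroids: the mean embedding of each class.

    H: (num_samples, dim) embeddings; y: (num_samples,) class labels.
    Returns the sorted class labels and a (num_classes, dim) centroid matrix.
    """
    H = np.asarray(H, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    C = np.stack([H[y == c].mean(axis=0) for c in classes])
    return classes, C

def least_squares_probe(H, y, ridge=1e-3):
    """Unit-norm probe directions from a ridge least-squares linear probe.

    Regresses centered one-hot labels on centered embeddings and returns one
    direction per class. This is an assumed stand-in for a trained probe.
    """
    H = np.asarray(H, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    Y = (y[:, None] == classes[None, :]).astype(float)  # one-hot targets
    Hc = H - H.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    d = H.shape[1]
    # Solve (Hc^T Hc + ridge * I) W = Hc^T Yc for W of shape (dim, num_classes).
    W = np.linalg.solve(Hc.T @ Hc + ridge * np.eye(d), Hc.T @ Yc).T
    norms = np.maximum(np.linalg.norm(W, axis=1, keepdims=True), 1e-12)
    return classes, W / norms
```

In practice the probe would be fit on training embeddings and the resulting directions analyzed on held-out data. The rows returned by either function can then be compared geometrically, for example through their pairwise cosines.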
