Entropy Aware Message Passing in Graph Neural Networks
Philipp Nazari, Oliver Lemke, Davide Guidobene, Artiom Gesp
TL;DR
This work tackles oversmoothing in deep Graph Neural Networks by introducing an entropy-aware message passing mechanism. It defines node energies $E_i$ via the Dirichlet energy and builds an unnormalized Boltzmann distribution $p_i = e^{-E_i/T}$ with temperature $T$; the Shannon entropy $S$ of this distribution drives a gradient ascent term that is added to each layer update. A closed-form gradient $\nabla_{\mathbf X_i} S$ keeps the computation efficient, with $\mathcal{O}(m+n)$ complexity for a graph with $n$ nodes and $m$ edges. The method is architecture-agnostic and is evaluated on standard benchmarks, where it mitigates oversmoothing about as well as existing baselines; the results also show that alleviating oversmoothing alone does not guarantee state-of-the-art accuracy in deep networks. The authors note that the best hyperparameters depend on the task, emphasize the flexibility of the implementation, and provide code for replication. Overall, the approach offers a physics-inspired, scalable regularization that can be integrated with various GNN designs to preserve embedding entropy during learning.
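To make the quantities in the summary concrete, here is a minimal, dependency-free sketch of the node energies and the entropy. The exact energy formula is not given in this summary, so the sketch assumes a common degree-normalized Dirichlet form, $E_i = \frac{1}{2}\sum_{j \in N(i)} \lVert \mathbf X_i/\sqrt{1+d_i} - \mathbf X_j/\sqrt{1+d_j} \rVert^2$; the function names `node_energies` and `entropy` and the toy triangle graph are illustrative only.

```python
import math

def node_energies(X, edges):
    """Per-node Dirichlet-style energy on an undirected graph.

    Assumed form (not taken from the summary):
    E_i = 1/2 * sum_{j in N(i)} ||x_i / sqrt(1 + d_i) - x_j / sqrt(1 + d_j)||^2
    """
    n = len(X)
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    scale = [1.0 / math.sqrt(1 + d) for d in deg]
    E = [0.0] * n
    for u, v in edges:  # each undirected edge is listed once
        d2 = sum((a * scale[u] - b * scale[v]) ** 2 for a, b in zip(X[u], X[v]))
        E[u] += 0.5 * d2
        E[v] += 0.5 * d2
    return E

def entropy(E, T=1.0):
    """S = -sum_i p_i * log(p_i) with unnormalized p_i = exp(-E_i / T)."""
    # log(p_i) = -E_i / T, so each term equals (E_i / T) * exp(-E_i / T).
    return sum((e / T) * math.exp(-e / T) for e in E)

# Toy triangle graph: identical embeddings vs. spread-out embeddings.
edges = [(0, 1), (1, 2), (0, 2)]
smooth = [[1.0, 2.0] for _ in range(3)]
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
S_smooth = entropy(node_energies(smooth, edges))
S_spread = entropy(node_energies(spread, edges))
```

With the unnormalized form, each node contributes $(E_i/T)\,e^{-E_i/T}$, which is zero at $E_i = 0$ and largest at $E_i = T$. Under the sketch's assumptions, a fully smoothed graph therefore has zero entropy, and ascending $S$ pushes node energies away from zero.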
Abstract
Deep Graph Neural Networks struggle with oversmoothing. This paper introduces a novel, physics-inspired GNN model designed to mitigate this issue. Our approach integrates with existing GNN architectures by adding an entropy-aware message passing term. This term performs gradient ascent on the entropy during node aggregation, thereby preserving a certain degree of entropy in the embeddings. We compare our model against state-of-the-art GNNs across various common datasets.
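The entropy-aware update can be sketched as a base aggregation plus a scaled entropy gradient. The following is a hedged illustration, not the authors' implementation: it assumes degree-normalized node energies $E_i = \frac{1}{2}\sum_{j \in N(i)} \lVert \mathbf X_i s_i - \mathbf X_j s_j \rVert^2$ with $s_i = 1/\sqrt{1+d_i}$, the unnormalized weights $p_i = e^{-E_i/T}$, a plain mean aggregation as a stand-in for the host GNN layer, and a gradient evaluated at the layer input. The step size `lam` and all function names are hypothetical. Under these assumptions the gradient has the closed form $\nabla_{\mathbf X_k} S = s_k \sum_{j \in N(k)} (w_k + w_j)(\mathbf X_k s_k - \mathbf X_j s_j)$ with $w_i = (1 - E_i/T)\,e^{-E_i/T}/T$, which needs one pass over the edges and one over the nodes.

```python
import math

def energies_with_diffs(X, edges):
    """Return node energies E, degree scalings s, and per-edge scaled differences."""
    n = len(X)
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    s = [1.0 / math.sqrt(1 + d) for d in deg]
    E = [0.0] * n
    diffs = []
    for u, v in edges:  # each undirected edge is listed once
        diff = [a * s[u] - b * s[v] for a, b in zip(X[u], X[v])]
        half_sq = 0.5 * sum(c * c for c in diff)
        E[u] += half_sq
        E[v] += half_sq
        diffs.append(diff)
    return E, s, diffs

def graph_entropy(X, edges, T=1.0):
    """S = sum_i (E_i / T) * exp(-E_i / T), i.e. -sum p_i log p_i with p_i = exp(-E_i / T)."""
    E, _, _ = energies_with_diffs(X, edges)
    return sum((e / T) * math.exp(-e / T) for e in E)

def entropy_gradient(X, edges, T=1.0):
    """Closed-form gradient of S with respect to every row of X (assumed energy form)."""
    E, s, diffs = energies_with_diffs(X, edges)
    w = [(1.0 - e / T) * math.exp(-e / T) / T for e in E]  # dS/dE_i
    grad = [[0.0] * len(X[0]) for _ in X]
    for (u, v), diff in zip(edges, diffs):
        c = w[u] + w[v]  # the edge term appears in both E_u and E_v
        for k, d in enumerate(diff):
            grad[u][k] += c * s[u] * d
            grad[v][k] -= c * s[v] * d
    return grad

def entropy_aware_step(X, edges, lam=0.1, T=1.0):
    """One update: mean aggregation (stand-in for a GNN layer) plus lam * grad_X S."""
    neigh = [[i] for i in range(len(X))]  # self-loop included in the mean
    for u, v in edges:
        neigh[u].append(v)
        neigh[v].append(u)
    grad = entropy_gradient(X, edges, T)
    out = []
    for i, nb in enumerate(neigh):
        agg = [sum(X[j][k] for j in nb) / len(nb) for k in range(len(X[0]))]
        out.append([a + lam * g for a, g in zip(agg, grad[i])])
    return out

# Toy usage on a 4-node graph with 2-dimensional embeddings.
X = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0], [0.5, 0.2]]
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
X_next = entropy_aware_step(X, edges, lam=0.1, T=1.0)
```

Setting `lam=0` recovers the plain aggregation, so the entropy term acts as an additive correction that can be attached to different host layers; in practice `lam` and `T` would be tuned per task.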
