Bayesian Neighborhood Adaptation for Graph Neural Networks
Paribesh Regmi, Rui Li, Kishan K C
TL;DR
This work tackles the challenge of automatically determining the optimal neighborhood scope for message passing in graph neural networks. It introduces a Bayesian framework that places a nonparametric beta-process prior on hop expansion over a potentially infinite number of hops, governs feature-level masking with a conjugate Bernoulli process, and uses variational inference to jointly learn hop contributions and GNN parameters. Theoretical expressivity analyses show that the method mitigates depth-related feature collapse and expands representational capacity, while empirical results demonstrate improved accuracy and well-calibrated predictions across both homophilic and heterophilic graphs, including large-scale datasets and a biomolecular network. The approach is compatible with a range of GNN variants and provides a principled, scalable mechanism for adaptive neighborhood aggregation with uncertainty quantification, enabling more robust and interpretable graph learning.
Abstract
The neighborhood scope (i.e., the number of hops) over which graph neural networks (GNNs) aggregate information to characterize a node's statistical properties is critical to their performance. Two-stage approaches, which train and validate a GNN for every pre-specified neighborhood scope to search for the best setting, are time-consuming and tend to be biased by the design of the search space. How to adaptively determine a proper neighborhood scope for aggregation on both homophilic and heterophilic graphs remains largely unexplored. We thus propose to model the GNNs' message-passing behavior on a graph as a stochastic process by treating the number of hops as a beta process. This Bayesian framework allows us to infer the most plausible neighborhood scope for message aggregation simultaneously with the optimization of GNN parameters. Our theoretical analysis shows that the scope inference improves the expressivity of a GNN. Experiments on benchmark homophilic and heterophilic datasets show that the proposed method is compatible with state-of-the-art GNN variants, achieving competitive or superior performance on the node classification task while providing well-calibrated predictions.
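
To make the idea of a beta process over hops with Bernoulli feature masks more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a truncated stick-breaking construction of the beta process, a GCN-style symmetric normalization, and a skip connection so that a fully masked hop passes features through unchanged. The function names (`sample_hop_masks`, `normalized_adjacency`, `masked_propagation`) and the hyperparameters `alpha` and `beta` are hypothetical, and the sketch only samples from the prior; the variational inference that learns hop contributions jointly with the GNN parameters is omitted.

```python
import numpy as np


def sample_hop_masks(num_hops, num_features, alpha=2.0, beta=1.0, rng=None):
    """Sample per-hop activation probabilities and binary feature masks.

    Assumption: truncated stick-breaking, pi_l = prod_{k<=l} nu_k with
    nu_k ~ Beta(alpha, beta), and z[l, j] ~ Bernoulli(pi_l).
    """
    rng = np.random.default_rng() if rng is None else rng
    nu = rng.beta(alpha, beta, size=num_hops)
    pi = np.cumprod(nu)  # non-increasing, so distant hops are used less often
    z = rng.random((num_hops, num_features)) < pi[:, None]
    return pi, z.astype(float)


def normalized_adjacency(adj):
    """Symmetrically normalize the adjacency matrix with self-loops added."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def masked_propagation(adj, x, weights, masks):
    """Apply one masked message-passing step per hop.

    Each hop's update is multiplied by its binary feature mask and added to
    the running representation, so a hop whose mask is all zeros contributes
    nothing. All weight matrices are assumed square (same width as x).
    """
    a_hat = normalized_adjacency(adj)
    h = x
    for w, z in zip(weights, masks):
        update = np.maximum(a_hat @ h @ w, 0.0) * z  # ReLU, then feature mask
        h = h + update  # skip connection
    return h


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # A 4-node path graph with 3 features per node (toy data).
    adj = np.array([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]], dtype=float)
    x = rng.normal(size=(4, 3))
    num_hops = 6  # truncation level of the (in principle infinite) process
    weights = [rng.normal(scale=0.5, size=(3, 3)) for _ in range(num_hops)]
    pi, masks = sample_hop_masks(num_hops, 3, rng=rng)
    h = masked_propagation(adj, x, weights, masks)
    print("activation probabilities per hop:", np.round(pi, 3))
    print("active features per hop:", masks.sum(axis=1))
    print("output shape:", h.shape)
```

Because the activation probabilities decay with the hop index, the number of hops that contribute at all is effectively finite even though the truncation level can be set generously; in a learned model, the posterior over these masks would indicate which hops the data supports.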
