Bayesian Neighborhood Adaptation for Graph Neural Networks

Paribesh Regmi; Rui Li; Kishan K C

TL;DR

This work tackles the challenge of automatically determining the optimal neighborhood scope for message passing in graph neural networks. It introduces a Bayesian framework that places a nonparametric beta-process prior over a potentially infinite number of hops, masks features at each hop through a conjugate Bernoulli process, and uses variational inference to jointly learn hop contributions and GNN parameters. Theoretical expressivity analyses show that the method mitigates depth-related feature collapse and expands representational capacity, while empirical results demonstrate improved accuracy and well-calibrated predictions across both homophilic and heterophilic graphs, including large-scale datasets and a biomolecular network. The approach is compatible with a range of GNN variants and provides a principled, scalable mechanism for adaptive neighborhood aggregation with uncertainty quantification.

Abstract

The neighborhood scope (i.e., the number of hops) over which graph neural networks (GNNs) aggregate information to characterize a node's statistical properties is critical to GNNs' performance. Two-stage approaches, which train and validate GNNs for every pre-specified neighborhood scope to search for the best setting, are time-consuming and tend to be biased by the design of the search space. How to adaptively determine proper neighborhood scopes for the aggregation process on both homophilic and heterophilic graphs remains largely unexplored. We thus propose to model the GNNs' message-passing behavior on a graph as a stochastic process by treating the number of hops as a beta process. This Bayesian framework allows us to infer the most plausible neighborhood scope for message aggregation simultaneously with the optimization of GNN parameters. Our theoretical analysis shows that the scope inference improves the expressivity of a GNN. Experiments on benchmark homophilic and heterophilic datasets show that the proposed method is compatible with state-of-the-art GNN variants, achieves competitive or superior performance on the node classification task, and provides well-calibrated predictions.
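To make the beta-process idea concrete, the following is a minimal sketch of a truncated stick-breaking draw paired with Bernoulli feature masks. It assumes one common construction, $\pi_l = \prod_{j \le l} \nu_j$ with $\nu_j \sim \mathrm{Beta}(\alpha, \beta)$, which may differ in detail from the paper's exact formulation. The function names, the truncation level, the hyperparameter values, and the "effective scope" rule are all illustrative assumptions rather than the authors' implementation.

```python
import random


def sample_hop_probabilities(num_hops, alpha=2.0, beta=1.0, rng=None):
    """Truncated stick-breaking draw (assumed construction).

    pi_l is the running product of Beta(alpha, beta) variables, so the
    contribution probability can only shrink as the hop index grows.
    """
    rng = rng or random.Random()
    pis, stick = [], 1.0
    for _ in range(num_hops):
        stick *= rng.betavariate(alpha, beta)
        pis.append(stick)
    return pis


def sample_feature_masks(pis, num_features, rng=None):
    """Bernoulli draw per hop and feature dimension: z[l][o] ~ Bernoulli(pi_l)."""
    rng = rng or random.Random()
    return [[1 if rng.random() < pi else 0 for _ in range(num_features)]
            for pi in pis]


rng = random.Random(0)
pis = sample_hop_probabilities(num_hops=6, rng=rng)
masks = sample_feature_masks(pis, num_features=8, rng=rng)

# Hypothetical rule: the effective scope is the deepest hop that still
# has at least one active feature (0 if every mask is empty).
scope = max((l + 1 for l, row in enumerate(masks) if any(row)), default=0)
print("hop probabilities:", [round(p, 3) for p in pis])
print("effective scope:", scope)
```

In this sketch, hops whose mask rows are all zero contribute nothing to aggregation, which is one way to read how a prior over infinitely many hops can still yield a finite neighborhood scope.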

Paper Structure (43 sections, 5 theorems, 38 equations, 11 figures, 13 tables, 1 algorithm)

Introduction
Preliminaries and Related Works
Message Aggregation Schemes
Aggregation Schemes for Heterophilic Graphs
Bayesian Methods for GNNs
Bayesian Neighborhood Adaptation for GNNs
Beta Process Prior over Infinite Neighborhood Scopes
GNN models as a Likelihood
Efficient Variational Approximation
Expressivity Analysis
Experiments
Neighborhood Scope Adaptation
Performance Comparison on GNN Variants
Expressivity with Deep GNN Structures
Uncertainty Quantification
...and 28 more sections

Key Result

Theorem 1

(Oono and Suzuki, 2019) Let $d_{\mathcal{M}}(\mathbf{H})$ denote the perpendicular distance between the representations $\mathbf{H}$ and the subspace $U$. Then the output representations of the $L$-th layer of a GCN, $\mathbf{H}_L$, converge exponentially to the subspace $U$.
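The collapse described by this theorem can be illustrated numerically. The sketch below is a simplified setting, not the theorem's full statement: it drops the weight matrices and the nonlinearity and applies only the symmetrically normalized adjacency with self-loops, $\hat{A} = \tilde{D}^{-1/2}(A + I)\tilde{D}^{-1/2}$, to a single feature channel. For a connected graph, the invariant subspace is then spanned by $\tilde{D}^{1/2}\mathbf{1}$. The toy graph, the initial features, and all function names are illustrative assumptions.

```python
import math

# Toy connected graph on 5 nodes (illustrative assumption).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 2)]
n = 5

# Adjacency with self-loops, then symmetric normalization.
A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
for i, j in edges:
    A[i][j] = A[j][i] = 1.0
deg = [sum(row) for row in A]
A_hat = [[A[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]

# Unit vector spanning the subspace U: proportional to sqrt(degree).
norm = math.sqrt(sum(deg))
e = [math.sqrt(d) / norm for d in deg]


def distance_to_subspace(h):
    """Perpendicular distance of h from span(e)."""
    proj = sum(ei * hi for ei, hi in zip(e, h))
    return math.sqrt(sum((hi - proj * ei) ** 2 for ei, hi in zip(e, h)))


def propagate(h):
    """One weight-free, activation-free propagation step: h <- A_hat h."""
    return [sum(A_hat[i][j] * h[j] for j in range(n)) for i in range(n)]


h = [1.0, -2.0, 0.5, 3.0, -1.0]  # arbitrary initial node features
distances = []
for _ in range(10):
    distances.append(distance_to_subspace(h))
    h = propagate(h)

for layer, d in enumerate(distances):
    print(f"layer {layer}: distance to U = {d:.4f}")
```

The printed distances shrink at every layer because every eigenvalue of $\hat{A}$ other than the leading one has magnitude below 1 in this setting. In the full theorem, the rate also depends on the singular values of the layer weights.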

Figures (11)

Figure 1: Illustration of our proposed neighborhood adaptation strategy. Left: The feature of a given node (black-colored) is generated by aggregating messages from neighbors located multiple hops away. The direction of message passing is indicated by arrows. The nodes in each hop $l$ are assigned a contribution probability ($\pi_l$) indicating their contribution in aggregation (color-coded). Right: Visualizing stick-breaking construction of a beta process. The sticks on top are random draws from a beta process, representing the probabilities over the number of hops. The bottom shows the conjugate Bernoulli process over node feature dimensions. Filled circles (blue) indicate a random draw of $1$ confirming the selection of a particular feature.
Figure 2: (left) Illustration of the convergence of the feature vector $\mathbf{H}$ to the subspace $U$. $\mathbf{P}$ and $d_{\mathcal{M}}(\mathbf{H})$ are the projection of $\mathbf{H}$ onto the subspace and its perpendicular distance from the subspace, respectively. $\theta$ is the size of the angular region spanned by $\mathbf{H}$ around $U$. (right) Visualization of the angular regions spanned by vanilla GCN (grey), ResGCN (blue), and BNA-GCN (purple) around the subspace $U$ (denoted by the dark line).
Figure 3: Evolution of the neighborhood scope and contribution probabilities over training epochs for the Pubmed dataset when trained with our method. The contribution probabilities $\pi_l$, and hence the neighborhood scope, increase as training progresses and settle at an optimal value.
Figure 4: The impact of increasing the depths ($L/T$) of GNN variants with and without our framework on their expressivity. Although the depth increase degrades the performance of vanilla ResGCN, GAT, and ACM-GCN, the application of our framework stabilizes their performance even for deep network structures.
Figure 5: t-SNE visualization of the node representations learned by ResGCN with and without our framework for shallow ($L=T=4$) and deep ($L=T=32/64$) structures. The representations of ResGCN converge in narrow, curve-shaped regions for deep structures. This indicates that the representations converge to a narrow subspace, which is consistent with Corollary \ref{cor:res}. Applying our framework (bottom row) addresses this issue, resulting in spread-out representations with deeper network structures. This suggests that the application of our framework enhances the expressivity.
...and 6 more figures

Theorems & Definitions (5)

Theorem 1
Lemma 1
Corollary 1
Theorem 2
Corollary 2
