Optimal Inference in Contextual Stochastic Block Models

O. Duranthon; L. Zdeborová

TL;DR

The paper derives a belief-propagation-based algorithm for the semi-supervised cSBM, conjectured to be optimal, and shows that there can be a considerable gap between its accuracy and the performance of the GNN architectures proposed in the literature. This suggests that the cSBM, along with the comparison to the performance of the optimal algorithm, can be instrumental in the development of more performant GNN architectures.

Abstract

The contextual stochastic block model (cSBM) was proposed for unsupervised community detection on attributed graphs where both the graph and the high-dimensional node information correlate with node labels. In the context of machine learning on graphs, the cSBM has been widely used as a synthetic dataset for evaluating the performance of graph-neural networks (GNNs) for semi-supervised node classification. We consider a probabilistic Bayes-optimal formulation of the inference problem and we derive a belief-propagation-based algorithm for the semi-supervised cSBM; we conjecture it is optimal in the considered setting and we provide its implementation. We show that there can be a considerable gap between the accuracy reached by this algorithm and the performance of the GNN architectures proposed in the literature. This suggests that the cSBM, along with the comparison to the performance of the optimal algorithm, readily accessible via our implementation, can be instrumental in the development of more performant GNN architectures.
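To make the synthetic dataset concrete, here is a minimal sketch of a two-community cSBM sampler. The parameterization is an assumption taken from the common convention in the cSBM literature and is not spelled out on this page: edge probabilities $(d \pm \lambda\sqrt{d})/N$, features $\sqrt{\mu/N}\, v u^\top$ plus Gaussian noise, and a fraction $\rho$ of revealed labels. The function name `sample_csbm` and its arguments are hypothetical and do not come from the authors' implementation.

```python
import numpy as np

def sample_csbm(n, p, d, lam, mu, rho, seed=0):
    """Sample a two-community cSBM (assumed standard parameterization).

    n: number of nodes, p: feature dimension, d: average degree,
    lam: graph signal-to-noise ratio (needs |lam| <= sqrt(d)),
    mu: feature signal strength, rho: fraction of revealed labels.
    """
    rng = np.random.default_rng(seed)
    u = rng.choice([-1, 1], size=n)        # hidden node labels
    v = rng.standard_normal(p)             # hidden feature direction

    # Sparse graph: same-label pairs connect with c_in/n, others with c_out/n.
    c_in = d + lam * np.sqrt(d)
    c_out = d - lam * np.sqrt(d)
    same = u[:, None] == u[None, :]
    prob = np.where(same, c_in, c_out) / n
    upper = np.triu(rng.random((n, n)) < prob, k=1)   # no self-loops
    adj = (upper | upper.T).astype(np.int8)           # symmetric adjacency

    # Features: rank-one signal plus i.i.d. standard Gaussian noise.
    feats = np.sqrt(mu / n) * np.outer(v, u) + rng.standard_normal((p, n))

    # Semi-supervision: each label is revealed independently with probability rho.
    revealed = rng.random(n) < rho
    return adj, feats, u, revealed
```

For example, `sample_csbm(n=400, p=40, d=5, lam=1.0, mu=2.0, rho=0.1)` uses $\alpha = N/P = 10$, $\mu^2 = 4$ and $d = 5$, the values quoted in several figure captions below. The dense $N \times N$ probability matrix keeps the sketch short but limits it to small $N$.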

Paper Structure (34 sections, 61 equations, 9 figures)

Introduction
Related work
Setup
Contextual stochastic block model (CSBM)
Bayes-optimal estimation
Detectability threshold and the effective signal-to-noise ratio
The AMP--BP Algorithm
Related work on message passing algorithms in CSBM
Bayes-optimal performance
Dense limit
Parameter estimation and Bethe free entropy
Semi-supervision and noisy labels
Comparison against graph neural networks
Comparison to GPR-GNN from previous literature
Baseline graph neural networks
...and 19 more sections

Figures (9)

Figure 1: Convergence to the high-dimensional limit. Overlap $q_U$ of the fixed point of AMP--BP vs the snr $\lambda$ for several system sizes $N$. Left: unsupervised case, $\rho=0$. Right: semi-supervised, $\rho=0.1$. The other parameters are $\alpha=10$, $\mu^2=4$, $d=5$. We run ten experiments per point.
Figure 2: Performances of AMP--BP and of the spectral algorithm of cSBM18 sec. 4. Overlap $q_U$ of the fixed point of the algorithms, vs snr $\lambda$ for a range of ratios $\alpha$. Left: unsupervised, $\rho=0$; right: semi-supervised, $\rho=0.1$. Vertical dashed lines on the left: theoretical thresholds $\lambda_c$ to partial recovery, eq. \ref{eq:lambda_c}. $N=3\times 10^4$, $\mu^2=4$, $d=5$. We run ten experiments per point.
Figure 3: Comparison against GPR-GNN pageRankGNN20. Overlap $q_U$ achieved by the algorithms, vs $\varphi=\frac{2}{\pi}\arctan(\frac{\lambda\sqrt\alpha}{\mu})$. Left: few nodes revealed $\rho=0.025$; right: more nodes revealed $\rho=0.6$. For GPR-GNN we plot the results of Fig. 2 and tables 5 and 6 from pageRankGNN20. $N=5\times 10^3$, $\alpha=2.5$, $\epsilon=3.25$, $d=5$. We run ten experiments per point for AMP--BP.
Figure 4: Comparison to GNNs of various architectures and convergence to a high-dimensional limit. Overlap $q_U$ achieved by the GNNs, vs the snr $\lambda$. Left: general convolution for different numbers of layers $K$; middle: for different types of convolutions, at the best $K$ (the detailed results for every $K$ are reported on Fig. \ref{fig:comparisonK_bis} of appendix \ref{sec:appendixFigures}); right: general convolution at $K=3$ for different sizes $N$. The other parameters are $N=3\times 10^4$, $\alpha=10$, $\mu^2=4$, $d=5$, $\rho=0.1$. We run five experiments per point.
Figure 5: Comparison against clipGNN baranwal23clipGNN. Overlap $q_U$ achieved by the algorithms, vs $\lambda$. $l$ is the size of the neighborhood clipGNN processes. Left: $\mu^2=50$ and $\alpha=50$, i.e. $P=200$; right: $\mu^2=500$ and $\alpha=500$, i.e. $P=20$. The other parameters are $N=10^4$ ($N=5\times 10^3$ for the two largest $l$), $\rho=0.05$, $d=5$, $L=1$. For clipGNN we run the code kindly provided by the authors; we run five experiments per point.
...and 4 more figures
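
The horizontal axis of Figure 3 is the angle-like parameter $\varphi=\frac{2}{\pi}\arctan(\frac{\lambda\sqrt\alpha}{\mu})$. The short sketch below evaluates it; the function name `phi` is hypothetical, and using `atan2` instead of a plain division is a choice made here so that the feature-free limit $\mu=0$ does not divide by zero.

```python
import math

def phi(lam, mu, alpha):
    """Return (2/pi) * arctan(lam * sqrt(alpha) / mu), assuming mu >= 0."""
    # For mu > 0, atan2(y, mu) equals atan(y / mu); for mu == 0 it gives
    # +-pi/2 (or 0 when y is also 0), so phi lies in [-1, 1].
    return (2.0 / math.pi) * math.atan2(lam * math.sqrt(alpha), mu)
```

Under this formula, $\varphi=0$ means $\lambda=0$, and $|\varphi|=1$ means $\mu=0$ with $\lambda\neq 0$; intermediate values weigh $\lambda\sqrt\alpha$ against $\mu$.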
