
Variational Inference: Posterior Threshold Improves Network Clustering Accuracy in Sparse Regimes

Xuezhen Li, Can M. Le

TL;DR

This work addresses community detection via variational inference in sparse networks, where existing theory mostly covers dense graphs and the variational loss surface has many saddle points. It proposes a simple hard-thresholding step applied to the posterior of the node labels after each iteration of batch coordinate ascent variational inference (BCAVI), yielding Threshold BCAVI (T-BCAVI). The authors prove guarantees under both the stochastic block model (SBM) and the degree-corrected SBM (DCSBM): T-BCAVI achieves accurate recovery with high probability in sparse regimes and provides consistent parameter estimates; when degrees grow, these estimates are asymptotically normal; and the DCSBM extension attains rate-optimal clustering. Empirically, T-BCAVI outperforms classical BCAVI and a leading sparse-network algorithm in extensive simulations and on real data, without requiring pre-processing.

Abstract

Variational inference has been widely used in the machine learning literature to fit various Bayesian models. In network analysis, this method has been successfully applied to community detection problems. Although these results are promising, their theoretical support covers only relatively dense networks, an assumption that may not hold for real networks. In addition, it has been shown recently that the variational loss surface has many saddle points, which may severely affect the method's performance, especially when it is applied to sparse networks. This paper proposes a simple way to improve the variational inference method by hard thresholding the posterior of the community assignment after each iteration. Using a random initialization that correlates with the true community assignment, we show that the proposed method converges and can accurately recover the true community labels, even when the average node degree of the network is bounded. An extensive numerical study further confirms the advantage of the proposed method over classical variational inference and another state-of-the-art algorithm.
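The thresholding idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an SBM with one within-community probability p and one between-community probability q, a uniform prior over labels, the standard mean-field form of the posterior update, and a threshold step that assigns each node to its highest-posterior community. The function names (threshold, estimate_pq, t_bcavi) are hypothetical.

```python
import numpy as np

def threshold(psi):
    """Hard-threshold each row of psi to a one-hot vector at its arg-max (assumed rule)."""
    z = np.zeros_like(psi)
    z[np.arange(psi.shape[0]), psi.argmax(axis=1)] = 1.0
    return z

def estimate_pq(A, z, eps=1e-10):
    """Plug-in estimates of p (within) and q (between) from one-hot labels z."""
    same = z @ z.T                      # same[i, j] = 1 if i and j share a label
    np.fill_diagonal(same, 0.0)         # ignore self-pairs
    diff = 1.0 - z @ z.T                # 1 if labels differ; diagonal is already 0
    np.fill_diagonal(diff, 0.0)
    p = (A * same).sum() / max(same.sum(), 1.0)
    q = (A * diff).sum() / max(diff.sum(), 1.0)
    # Clip away from 0 and 1 so the logarithms below stay finite.
    return float(np.clip(p, eps, 1 - eps)), float(np.clip(q, eps, 1 - eps))

def t_bcavi(A, z0, n_iter=10):
    """Sketch of thresholded batch updates; A is symmetric 0/1 with zero diagonal."""
    z = threshold(np.asarray(z0, dtype=float))
    for _ in range(n_iter):
        p, q = estimate_pq(A, z)
        edges = A @ z                            # edges from each node into each community
        non_edges = z.sum(axis=0) - z - edges    # non-neighbours per community, excluding the node itself
        score = edges * np.log(p / q) + non_edges * np.log((1 - p) / (1 - q))
        score -= score.max(axis=1, keepdims=True)    # stabilise the softmax
        psi = np.exp(score)
        psi /= psi.sum(axis=1, keepdims=True)        # mean-field posterior over labels
        z = threshold(psi)                           # thresholding after every iteration
    return z
```

Classical BCAVI would carry the soft posterior psi into the next iteration; under the assumptions above, the only change here is that psi is replaced by its one-hot thresholded version before the parameters are re-estimated.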

Paper Structure (18 sections, 17 theorems, 251 equations, 11 figures, 2 algorithms)

This paper contains 18 sections, 17 theorems, 251 equations, 11 figures, 2 algorithms.

Key Result

Proposition 1

Consider an SBM that satisfies Assumption ass:model assumption. In addition, assume that the initialization for Threshold BCAVI satisfies Assumption ass:perturb with fixed error rate $\varepsilon\in(0,(K-1)/K)$. Let $d=(n/K - 1)p + n(K-1)q/K$ denote the expected average degree. Then there exist con Moreover, for $s \geq 2$,
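As a quick illustration of the expected average degree $d=(n/K - 1)p + n(K-1)q/K$ defined above, the following sketch evaluates it for a balanced SBM. The values n = 600 and K = 2 match the simulation settings in the figure captions; p = 0.02 and q = 0.005 are hypothetical values chosen only to show a bounded-degree regime, and the function name is an assumption.

```python
def expected_average_degree(n, K, p, q):
    """d = (n/K - 1) * p + n * (K - 1) * q / K for K equal-sized communities."""
    within = (n / K - 1) * p        # expected neighbours in the node's own community
    between = n * (K - 1) * q / K   # expected neighbours in the other communities
    return within + between

# Hypothetical sparse setting: 299 * 0.02 + 300 * 0.005 = 5.98 + 1.5 = 7.48
d = expected_average_degree(n=600, K=2, p=0.02, q=0.005)
```

The first term counts the n/K - 1 other nodes in a node's own community, and the second counts the n(K-1)/K nodes outside it.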

Figures (11)

  • Figure 1: Performance of Threshold BCAVI (T-BCAVI), the classical BCAVI, majority vote (MV), and majority vote with penalization (P-MV) in balanced settings. (a)-(c): Networks are generated from SBM with $n=600$ nodes, $K=2$ communities of sizes $n_1 = n_2 = 300$. (d)-(f): Networks are generated from SBM with $n=600$ nodes, $K=3$ communities of sizes $n_1 = n_2 = n_3 = 200$. Initializations are generated from true node labels according to Assumption \ref{ass:perturb} with error rate $\varepsilon$, resulting in actual clustering initialization accuracy (RI) of approximately $1-\varepsilon$.
  • Figure 2: Performance of Threshold BCAVI (T-BCAVI), the classical BCAVI, majority vote (MV), and majority vote with penalization (P-MV) in unbalanced settings. (a)-(c): Networks are generated from SBM with $n=600$ nodes, $K=2$ communities of sizes $n_1 = 240, n_2 = 360$. (d)-(f): Networks are generated from SBM with $n=600$ nodes, $K=3$ communities of sizes $n_1 = 150, n_2 = 210, n_3 = 240$. Initializations are generated from true node labels according to Assumption \ref{ass:perturb} with error rate $\varepsilon$, resulting in actual clustering initialization accuracy (RI) of approximately $1-\varepsilon$.
  • Figure 3: Relative errors of parameter estimation by Threshold BCAVI (T) and the classical BCAVI in balanced settings. (a)-(c): Networks are generated from SBM with $n=600$ nodes, $K=2$ communities of sizes $n_1 = n_2 = 300$. (d)-(f): Networks are generated from SBM with $n=600$ nodes, $K=3$ communities of sizes $n_1 = n_2 = n_3 = 200$. Initializations are generated from true node labels according to Assumption \ref{ass:perturb} with error rate $\varepsilon$.
  • Figure 4: Performance of Threshold BCAVI (T-BCAVI), the classical BCAVI, majority vote (MV), and majority vote with penalization (P-MV) in balanced settings. (a)-(c): Networks are generated from SBM with $n=600$ nodes, $K=2$ communities of sizes $n_1 = n_2 = 300$. (d)-(f): Networks are generated from SBM with $n=600$ nodes, $K=3$ communities of sizes $n_1 = n_2 = n_3 = 200$. Initializations are computed by spectral clustering (SCI) applied to sampled sub-networks $A^{(\text{init})}$ with sampling probability $\tau$ while T-BCAVI and BCAVI are performed on remaining sub-networks $A-A^{(\text{init})}$.
  • Figure 5: Performance of Threshold BCAVI (T-BCAVI), the classical BCAVI, majority vote (MV), and majority vote with penalization (P-MV) in unbalanced settings. (a)-(c): Networks are generated from SBM with $n=600$ nodes, $K=2$ communities of sizes $n_1 = 240, n_2 = 360$. (d)-(f): Networks are generated from SBM with $n=600$ nodes, $K=3$ communities of sizes $n_1 = 150, n_2 =210, n_3 = 240$. Initializations are computed by spectral clustering (SCI) applied to sampled sub-networks $A^{(\text{init})}$ with sampling probability $\tau$ while T-BCAVI and BCAVI are performed on remaining sub-networks $A-A^{(\text{init})}$.
  • ...and 6 more figures
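The captions of Figures 4 and 5 describe computing the initialization on a sampled sub-network $A^{(\text{init})}$ and running the algorithms on the remainder $A-A^{(\text{init})}$. A minimal sketch of such a split follows, assuming each edge is placed in the initialization sub-network independently with probability $\tau$; the captions do not spell out the sampling scheme, so this is an assumption, and the function name split_edges is hypothetical.

```python
import numpy as np

def split_edges(A, tau, rng):
    """Split a symmetric 0/1 adjacency matrix A into (A_init, A_rest) with A_init + A_rest == A."""
    upper = np.triu(A, k=1)                # each undirected edge appears once
    keep = rng.random(A.shape) < tau       # independent coin flip per node pair (assumed scheme)
    init_upper = upper * keep
    A_init = init_upper + init_upper.T     # symmetrise the sampled edges
    return A_init, A - A_init
```

Because the two sub-networks share no edges, an initialization computed from A_init does not reuse the edges that the main algorithm then sees in the remainder.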

Theorems & Definitions (17)

  • Proposition 1: Parameter estimation for SBM
  • Theorem 2: Clustering accuracy for SBM
  • Theorem 3: Limiting distribution of parameter estimates for SBM
  • Proposition 4: Parameter estimation for DCSBM
  • Theorem 5: Clustering accuracy of Threshold BCAVI in DCSBM
  • Lemma 6: Concentration inequality
  • Lemma 7: Number of removed edges
  • Lemma 8: Concentration of regularized adjacency matrices
  • Lemma 9: Propositions of the parameter $\lambda$
  • Lemma 10: First step: Parameter estimation
  • ...and 7 more