Bayesian nonparametric community detection in assortative stochastic block models

Martina Amongero, Pierpaolo De Blasi

TL;DR

The paper investigates Bayesian nonparametric community detection in assortative stochastic block models, focusing on how enforcing assortativity in the prior affects recovery when the number of communities $k$ is unknown. It introduces a prior that enforces the assortativity constraint $p>q$, where $p$ and $q$ are the diagonal (within-block) and off-diagonal (between-block) connection probabilities, by conditioning them on a cutoff $\epsilon$. It also develops a marginal Gibbs sampler that handles unknown $k$ via Gibbs-type partition priors with negative $\sigma$. An auxiliary-variable scheme is proposed to cope with non-conjugacy and to allow sampling of new clusters, enabling joint inference of $k$, block memberships, and connectivity parameters. In illustrative examples and extensive simulations on realistic benchmark networks, the method shows improved clustering accuracy and convergence relative to standard SBMs, particularly in weak-signal and heterogeneous settings. The work provides a practical Bayesian framework for robust, data-driven detection of assortative communities with an unknown number of groups; extensions to weak assortativity and efficiency considerations are outlined as future work.
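
To make the constraint $p>q$ concrete, here is a minimal sketch of one way to draw a pair of connection probabilities whose joint density is supported on $q<p$. It assumes independent Uniform(0,1) draws truncated to that region by rejection sampling. This is an illustrative assumption, not the paper's construction, which conditions on a cutoff $\epsilon$ whose exact form is not given here. The function name `sample_assortative_pair` is hypothetical.

```python
import random


def sample_assortative_pair(rng):
    """Draw (p, q) uniformly on the triangle q < p by rejection sampling.

    Assumption for illustration only: the unconstrained prior is a pair of
    independent Uniform(0, 1) variables, and assortativity is imposed by
    discarding draws that violate q < p.
    """
    while True:
        p, q = rng.random(), rng.random()
        if q < p:
            return p, q


rng = random.Random(0)
draws = [sample_assortative_pair(rng) for _ in range(10_000)]

# Under this truncated-uniform assumption the marginals are no longer flat:
# p has density 2p (mean 2/3) and q has density 2(1 - q) (mean 1/3).
mean_p = sum(p for p, _ in draws) / len(draws)
mean_q = sum(q for _, q in draws) / len(draws)
print(f"mean p = {mean_p:.3f}, mean q = {mean_q:.3f}")
```

Under this assumption the truncation alone shifts the marginal of $p$ toward 1 and that of $q$ toward 0, so the constraint is informative even before any data are observed.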

Abstract

Structured data in the form of networks are increasingly common in a number of fields, including the social sciences, biology, physics, computer science, and many others. A key task in network analysis is community detection, which typically consists of dividing the nodes into groups such that nodes within a group are strongly connected, while connections between groups are relatively scarce. A generative model well suited for the formation of such communities is the assortative stochastic block model (SBM), which prescribes a higher probability of a connection between nodes belonging to the same block rather than to different blocks. A recent line of work has utilized Bayesian nonparametric methods to recover communities in the SBM by placing a prior distribution on the number of blocks and estimating block assignments via collapsed Gibbs samplers. However, efficiently incorporating the assortativity constraint through the prior remains an open problem. In this work, we address this gap by studying the effect of enforcing assortativity on Bayesian community detection and identifying the scenarios in which it pays dividends in comparison with standard SBM. We illustrate our findings through an extensive simulation study.
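
The generative model described in the abstract can be sketched as follows: a minimal simulation of an assortative SBM with a single within-block probability `p` and a single between-block probability `q`, with `q < p`. The function names, block sizes, and parameter values are hypothetical choices for illustration. The paper itself allows block-specific probabilities and an unknown number of blocks, neither of which is modelled here.

```python
import random


def sample_assortative_sbm(block_sizes, p, q, seed=0):
    """Sample an undirected, loop-free graph from a simple assortative SBM.

    Nodes in the same block connect with probability p, nodes in different
    blocks with probability q, and assortativity requires q < p.
    """
    if not (0.0 <= q < p <= 1.0):
        raise ValueError("assortativity requires 0 <= q < p <= 1")
    rng = random.Random(seed)
    # Node labels: block index repeated block-size times.
    labels = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if labels[i] == labels[j] else q
            if rng.random() < prob:
                adj[i][j] = adj[j][i] = 1  # symmetric adjacency matrix
    return labels, adj


def edge_densities(labels, adj):
    """Return the empirical (within-block, between-block) edge densities."""
    counts = {True: [0, 0], False: [0, 0]}  # same_block -> [edges, pairs]
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            same = labels[i] == labels[j]
            counts[same][0] += adj[i][j]
            counts[same][1] += 1
    return (counts[True][0] / counts[True][1],
            counts[False][0] / counts[False][1])


labels, adj = sample_assortative_sbm([30, 20, 10], p=0.5, q=0.05)
within, between = edge_densities(labels, adj)
print(f"within-block density = {within:.3f}, between-block = {between:.3f}")
```

Community detection is the inverse problem: given only `adj`, recover `labels` (and, in the nonparametric setting, the number of blocks).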

Paper Structure

This paper contains 8 sections, 43 equations, 6 figures, 2 tables, 4 algorithms.

Figures (6)

  • Figure 1: The left panel illustrates the marginal of $p$ and $q$, the grey horizontal line marks the uniform density. The right panel shows the joint density which is supported on $q<p$.
  • Figure 2: Right: synthetic network featuring a large and densely connected community (red) with links to peripheral nodes, which belong to two smaller communities (blue and green). Left: adjacency matrix with node memberships shown by the colored bar.
  • Figure 3: Posterior clustering of standard (left) and assortative (right) SBM.
  • Figure 4: Posterior similarity matrix for standard (left) and assortative (right) SBM.
  • Figure 5: Joint density of between- vs. within-block probabilities for the standard case (top left) and the assortative case (bottom left, top and bottom right). For each $a\neq b$ we plot the joint posterior densities of $(P_{aa},P_{ab})$ and $(P_{bb},P_{ab})$. Coloring of classes is consistent with that of Figure 2.
  • ...and 1 more figures