Optimal recovery by maximum and integrated conditional likelihood in the general Stochastic Block Model

Andressa Cerqueira, Florencia Leonardi

TL;DR

Maximum conditional likelihood achieves the optimal known threshold for exact recovery in the logarithmic degree regime, while the integrated conditional likelihood attains only a sub-optimal constant in the same regime.

Abstract

In this paper, we obtain new results on the weak and strong consistency of the maximum and integrated conditional likelihood estimators for the community detection problem in the Stochastic Block Model with k communities. In particular, we show that maximum conditional likelihood achieves the optimal known threshold for exact recovery in the logarithmic degree regime. For the integrated conditional likelihood, we obtain a sub-optimal constant in the same regime. Both methods are shown to be weakly consistent in the divergent degree regime. These results confirm the optimality of maximum likelihood for the task of community detection, which had remained an open problem until now.

Paper Structure

This paper contains 10 sections, 14 theorems, 184 equations, 2 figures.

Key Result

Lemma 2.1

For all $n$ we have that

Figures (2)

  • Figure 1: Mean normalized mutual information (NMI) between estimated and true community memberships over 50 simulated balanced networks with $n=200$ and $\rho_n=\log n/ n$, for the approximate solutions of the ML and ICL estimators. The dashed vertical line marks the value of the constant at which the phase transition occurs.
  • Figure 2: Mean NMI between estimated and true community memberships over 50 simulated balanced networks with $n=200$, $k=2$, and $(\sqrt{s_1}-\sqrt{s_2})^2>2$, for the approximate solutions of the ML and ICL estimators. The dashed line is set at $\rho_n=\log n / n$.
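The simulation setup in these captions can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `simulate_balanced_sbm` and `nmi`, the parametrization of the edge probabilities as `s1 * rho_n` within communities and `s2 * rho_n` between them, the values `s1 = 10` and `s2 = 1`, and the choice to normalize mutual information by the mean of the two entropies are all assumptions made for the example. The ML and ICL estimators themselves are not implemented here.

```python
import numpy as np

def simulate_balanced_sbm(n, k, p_in, p_out, rng):
    """Sample an undirected balanced k-community SBM (hypothetical helper).

    Returns the 0/1 adjacency matrix and the true labels.
    """
    labels = np.arange(n) % k                       # balanced assignment
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out)
    # Draw each pair i < j once, then symmetrize; the diagonal stays zero.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(int)
    return adj, labels

def nmi(a, b):
    """NMI of two labelings, normalized by the mean of the two entropies
    (one common convention; assumed here)."""
    a = np.asarray(a).ravel()
    b = np.asarray(b).ravel()
    _, ai = np.unique(a, return_inverse=True)
    _, bi = np.unique(b, return_inverse=True)
    cont = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(cont, (ai, bi), 1)                    # contingency table
    pij = cont / a.size
    pi, pj = pij.sum(axis=1), pij.sum(axis=0)
    nz = pij > 0
    mi = np.sum(pij[nz] * np.log(pij[nz] / np.outer(pi, pj)[nz]))
    h_a = -np.sum(pi * np.log(pi))
    h_b = -np.sum(pj * np.log(pj))
    if h_a == 0 and h_b == 0:
        return 1.0                                  # both labelings are trivial
    return mi / ((h_a + h_b) / 2)

rng = np.random.default_rng(0)
n, k = 200, 2
rho_n = np.log(n) / n
s1, s2 = 10.0, 1.0                                  # assumed constants
adj, labels = simulate_balanced_sbm(n, k, s1 * rho_n, s2 * rho_n, rng)
```

NMI is invariant to relabeling of the communities, so an estimate that recovers the partition exactly scores 1 even if its community indices are permuted.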

Theorems & Definitions (29)

  • Lemma 2.1
  • Theorem 3.1
  • Theorem 3.2
  • Remark 3.3
  • Theorem 3.4
  • Proof of Theorem \ref{teorema-weak}
  • Proof of Theorem \ref{teorema-chave}
  • Proof of Theorem \ref{teorema-chave-icl}
  • Proof of Lemma \ref{lemma-qml-qb}
  • Lemma 7.1
  • ...and 19 more