Multi-View Stochastic Block Models

Vincent Cohen-Addad, Tommaso d'Orsi, Silvio Lattanzi, Rajai Nasser

TL;DR

This paper formalizes a new family of models, called \textit{multi-view stochastic block models}, that captures multi-view graph clustering, where one has access to multiple data sources. It also introduces a new efficient algorithm that provably outperforms previous approaches by analyzing the structure of each graph separately.

Abstract

Graph clustering is a central topic in unsupervised learning with a multitude of practical applications. In recent years, multi-view graph clustering has gained a lot of attention for its applicability to real-world instances where one has access to multiple data sources. In this paper we formalize a new family of models, called \textit{multi-view stochastic block models}, that captures this setting. For this model, we first study efficient algorithms that naively work on the union of multiple graphs. Then, we introduce a new efficient algorithm that provably outperforms previous approaches by analyzing the structure of each graph separately. Furthermore, we complement our results with an information-theoretic lower bound on what can be achieved in this model. Finally, we corroborate our results with experimental evaluations.

Paper Structure

This paper contains 29 sections, 14 theorems, 141 equations, 2 figures, 2 algorithms.

Key Result

Theorem 1.2

Let $n,k>0\,.$ Let $(\mathbf{z},(\mathbf{f}_1,\mathbf{G}_1),\ldots,(\mathbf{f}_t,\mathbf{G}_t))\sim\textnormal{($\mathcal{T}$,k,t)-}\mathsf{MV\textnormal{-}SBM}_{n}$ for a sequence of tuples $\mathcal{T}=\{(d_\ell,\varepsilon_\ell)\}_{\ell=1}^t\,,$ each satisfying $d_\ell\cdot \varepsilon_\ell^2/4> 1 \ldots$

Figures (2)

  • Figure 1: Fixing $t=10$, $n=1000$, $k=10$, $d=50$ and varying $\varepsilon$ in $[0.5,1.5]$.
  • Figure 2: Fixing $t=10$, $n=1000$, $k=10$, $\varepsilon=0.5$ and varying $d$ in $[50, 150]$.

Theorems & Definitions (38)

  • Theorem 1.2: Weak recovery for multi-view models
  • Theorem 1.3: Lower bound for multi-view models - Informal
  • Corollary 1.4: Exact recovery for multi-view models
  • Remark 2.1: Connection with liu2022statistical
  • Theorem 3.1: Pair-wise weak recovery for unbalanced $2$ communities stochastic block model
  • Remark 4.1: Running time
  • Definition 4.2: Balanced vector
  • Lemma 4.4: Graphs structure from good instances
  • Proof: Proof of Lemma 4.4
  • Remark 4.6: Running time
  • ...and 28 more