Exact recovery for seeded graph matching

Nicolas Fraiman, Michael Nisenzon

TL;DR

This work analyzes exact matching between two correlated graphs in the almost fully seeded regime under the Seeded Correlated SBM, showing a sharp threshold: exact recovery is possible in polynomial time whenever $\lambda s^2 > 1-\alpha$, bridging the fully seeded and unseeded settings. It develops and analyzes multiple algorithms, including simple neighborhood-overlap methods and a seeded $\ell_1$-LP with projection (and a fast Frank–Wolfe variant), proving algorithmic achievability above the threshold and information-theoretic impossibility below. The results provide the first tight statistical/computational characterization for graph matching with a vanishing fraction of unrevealed vertices, and empirically demonstrate robust performance on synthetic and real networks, with implications for semi-supervised community detection. Overall, the paper shows that revealing all but $n^{1-\alpha}$ correspondences substantially lowers the information threshold and enables efficient exact recovery in a broad, practically relevant regime.

Abstract

We study graph matching between two correlated networks in the almost fully seeded regime, where all but a vanishing fraction of vertex correspondences are revealed. Concretely, we consider the correlated stochastic block model and assume that $n^{1-\alpha}$ vertices remain unrevealed for some $\alpha \in (0,1)$, while the remaining $n - n^{1-\alpha}$ vertices are provided as seed correspondences. Our goal is to determine when the true permutation can be recovered efficiently as the proportion of unrevealed vertices vanishes. We prove that exact recovery of the remaining correspondences is achievable in polynomial time whenever $\lambda s^{2} > 1 - \alpha$, where $\lambda = (a+b)/2$ is the SBM density parameter and $s$ denotes the edge retention parameter. This condition smoothly interpolates between the fully seeded setting and the classical unseeded threshold $\lambda s^{2} > 1$ for matching in correlated Erdős-Rényi graphs. Our analysis applies to both a simple neighborhood-overlap rule and a bistochastic relaxation followed by projection, establishing matching achievability in the almost fully seeded regime without requiring spectral methods or message passing. On the converse side, we show that below the same threshold, exact recovery is information-theoretically impossible with high probability. Thus, to our knowledge, we obtain the first tight statistical and computational characterization of graph matching when only a vanishing fraction of vertices remain unrevealed. Our results complement recent progress in semi-supervised community detection by demonstrating that revealing all but $n^{1-\alpha}$ correspondences similarly lowers the information threshold for graph matching.
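
A back-of-the-envelope count suggests where the exponent $1-\alpha$ in the threshold comes from. This is a heuristic sketch, not the paper's argument. It assumes two balanced communities and the logarithmic-degree scaling in which within- and across-community edge probabilities are $a \log n / n$ and $b \log n / n$; neither assumption is stated in the abstract. Under them, the probability that a given vertex has no edge surviving in both graphs is roughly

$$\left(1 - \frac{a s^{2} \log n}{n}\right)^{n/2} \left(1 - \frac{b s^{2} \log n}{n}\right)^{n/2} \approx \exp\!\left(-\frac{a+b}{2}\, s^{2} \log n\right) = n^{-\lambda s^{2}},$$

so the expected number of unrevealed vertices with no shared edges is about

$$n^{1-\alpha} \cdot n^{-\lambda s^{2}} = n^{\,1-\alpha-\lambda s^{2}},$$

which tends to zero exactly when $\lambda s^{2} > 1 - \alpha$. For instance, with $\alpha = 1/2$ (about $\sqrt{n}$ unrevealed vertices) the condition reads $\lambda s^{2} > 1/2$, while letting $\alpha \to 0$ recovers the unseeded condition $\lambda s^{2} > 1$.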

Paper Structure

This paper contains 14 sections, 11 theorems, 102 equations, 1 figure, 3 tables, 4 algorithms.

Key Result

Lemma 1

Let $(A,B) \sim \mathop{\mathrm{SCSBM}}\nolimits(n,a,b,s,\mathcal{R},\sigma^*,\pi^*)$ be defined with $|\mathcal{U}| = n^{1-\alpha}$ for some $\alpha \in [0,1]$. Then for any unrevealed $u,v \in \mathcal{U}$ with $v \neq \pi^*(u)$, we have that for any $\varepsilon > 0$, the size of the intersection […]. In addition, if $v = \pi^*(u)$, then for any $\varepsilon > 0$, […].
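
The contrast in Lemma 1 between a true pair and a wrong pair can be illustrated with a small simulation. The Python sketch below is not the paper's algorithm. It assumes that the intersection in question counts seeded neighbors common to $u$ in $A$ and $v$ in $B$, that there are two balanced communities, and that edge probabilities are $a \log n / n$ and $b \log n / n$. All function names and parameter values are hypothetical choices made for illustration.

```python
import math
import random

def sample_correlated_sbm(n, a, b, s, rng):
    """Sample two edge-subsampled copies of one parent SBM (hypothetical helper)."""
    labels = [i % 2 for i in range(n)]            # assumption: two balanced communities
    p_in, p_out = a * math.log(n) / n, b * math.log(n) / n  # assumed log-degree scaling
    adj_a = [set() for _ in range(n)]
    adj_b = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:                  # edge present in the parent graph
                if rng.random() < s:              # retained in A
                    adj_a[i].add(j); adj_a[j].add(i)
                if rng.random() < s:              # retained independently in B
                    adj_b[i].add(j); adj_b[j].add(i)
    return adj_a, adj_b

def relabel(adj, perm):
    """Relabel vertex i as perm[i], hiding the correspondence."""
    out = [set() for _ in adj]
    for i, nbrs in enumerate(adj):
        out[perm[i]] = {perm[j] for j in nbrs}
    return out

def match_by_seed_overlap(adj_a, adj_b, seeds, unrevealed_a, unrevealed_b):
    """Map each unrevealed u to the candidate v sharing the most seeded neighbors."""
    matched = {}
    for u in unrevealed_a:
        # Images in B of u's seeded neighbors in A.
        images = {seeds[w] for w in adj_a[u] if w in seeds}
        matched[u] = max(unrevealed_b, key=lambda v: len(images & adj_b[v]))
    return matched

rng = random.Random(0)
n, a, b, s, alpha = 400, 6.0, 2.0, 0.9, 0.5      # lambda * s^2 = 3.24 > 1 - alpha
adj_a, adj_b_unpermuted = sample_correlated_sbm(n, a, b, s, rng)
perm = list(range(n))
rng.shuffle(perm)                                 # ground-truth correspondence
adj_b = relabel(adj_b_unpermuted, perm)

num_unrevealed = round(n ** (1 - alpha))
unrevealed_a = rng.sample(range(n), num_unrevealed)
hidden = set(unrevealed_a)
seeds = {w: perm[w] for w in range(n) if w not in hidden}
unrevealed_b = sorted(perm[u] for u in unrevealed_a)

matched = match_by_seed_overlap(adj_a, adj_b, seeds, unrevealed_a, unrevealed_b)
accuracy = sum(matched[u] == perm[u] for u in unrevealed_a) / num_unrevealed
print(f"fraction of unrevealed vertices matched correctly: {accuracy:.2f}")
```

The per-vertex argmax used here does not force the output to be a bijection. A variant that solves an assignment problem over the same overlap counts would enforce one, at the cost of an extra dependency.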

Figures (1)

  • Figure 1: Accuracy as a function of seed fraction for CSBM on 1000 nodes.

Theorems & Definitions (26)

  • Definition 1: Seeded Correlated SBM
  • Definition 2: Exact Graph Matching
  • Definition 3: Equivariance
  • Definition 4: Hard isolated vertices
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • ...and 16 more