Exact recovery for seeded graph matching
Nicolas Fraiman, Michael Nisenzon
TL;DR
This work analyzes exact matching between two correlated graphs in the almost fully seeded regime under the Seeded Correlated SBM, showing a sharp threshold: exact recovery is possible in polynomial time whenever $\lambda s^2 > 1-\alpha$, bridging the fully seeded and unseeded settings. It develops and analyzes multiple algorithms, including simple neighborhood-overlap methods and a seeded $\ell_1$-LP with projection (and a fast Frank–Wolfe variant), proving algorithmic achievability above the threshold and information-theoretic impossibility below. The results provide the first tight statistical/computational characterization for graph matching with a vanishing fraction of unrevealed vertices, and empirically demonstrate robust performance on synthetic and real networks, with implications for semi-supervised community detection. Overall, the paper shows that revealing all but $n^{1-\alpha}$ correspondences substantially lowers the information threshold and enables efficient exact recovery in a broad, practically relevant regime.
Abstract
We study graph matching between two correlated networks in the almost fully seeded regime, where all but a vanishing fraction of vertex correspondences are revealed. Concretely, we consider the correlated stochastic block model and assume that $n^{1-\alpha}$ vertices remain unrevealed for some $\alpha \in (0,1)$, while the remaining $n - n^{1-\alpha}$ vertices are provided as seed correspondences. Our goal is to determine when the true permutation can be recovered efficiently as the proportion of unrevealed vertices vanishes. We prove that exact recovery of the remaining correspondences is achievable in polynomial time whenever $\lambda s^{2} > 1 - \alpha$, where $\lambda = (a+b)/2$ is the SBM density parameter and $s$ denotes the edge retention parameter. This condition smoothly interpolates between the fully seeded setting and the classical unseeded threshold $\lambda s^{2} > 1$ for matching in correlated Erdős-Rényi graphs. Our analysis applies to both a simple neighborhood-overlap rule and a bistochastic relaxation followed by projection, establishing matching achievability in the almost fully seeded regime without requiring spectral methods or message passing. On the converse side, we show that below the same threshold, exact recovery is information-theoretically impossible with high probability. Thus, to our knowledge, we obtain the first tight statistical and computational characterization of graph matching when only a vanishing fraction of vertices remain unrevealed. Our results complement recent progress in semi-supervised community detection by demonstrating that revealing all but $n^{1-\alpha}$ correspondences similarly lowers the information threshold for graph matching.
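To make the neighborhood-overlap idea concrete, the following is a minimal Python sketch and not the authors' exact procedure. It scores each unrevealed pair $(u, v)$ by the number of seed pairs adjacent to $u$ in the first graph and to $v$ in the second, then assigns pairs greedily by decreasing score. The function name `seeded_overlap_match`, its argument names, the greedy tie-breaking, and the toy graph are all assumptions made for illustration; the paper's rule may select matches differently.

```python
import numpy as np

def seeded_overlap_match(A, B, seed_a, seed_b, free_a, free_b):
    """Match unrevealed vertices by counting common seeded neighbors.

    A, B           : symmetric 0/1 adjacency matrices of the two graphs.
    seed_a, seed_b : seed_a[k] in A is known to correspond to seed_b[k] in B.
    free_a, free_b : unrevealed vertices in A and B (assumed equal in number).
    Returns a dict sending each vertex of free_a to a vertex of free_b.
    """
    seed_a, seed_b = np.asarray(seed_a), np.asarray(seed_b)
    free_a, free_b = np.asarray(free_a), np.asarray(free_b)

    # overlap[i, j] = number of seed pairs k with
    # A[free_a[i], seed_a[k]] = 1 and B[free_b[j], seed_b[k]] = 1.
    overlap = A[np.ix_(free_a, seed_a)] @ B[np.ix_(free_b, seed_b)].T

    # Greedy assignment (an illustrative choice): repeatedly take the
    # largest remaining score and retire its row and column.
    scores = overlap.astype(float)
    match = {}
    for _ in range(len(free_a)):
        i, j = np.unravel_index(np.argmax(scores), scores.shape)
        match[int(free_a[i])] = int(free_b[j])
        scores[i, :] = -np.inf
        scores[:, j] = -np.inf
    return match

# Toy example: B is a relabeled copy of A (no edge noise, i.e. s = 1).
A = np.zeros((6, 6), dtype=int)
for i, j in [(4, 0), (4, 1), (5, 2), (5, 3), (0, 2)]:
    A[i, j] = A[j, i] = 1

pi = np.array([2, 0, 3, 1, 5, 4])   # hidden permutation: vertex i in A is pi[i] in B
B = np.zeros_like(A)
B[np.ix_(pi, pi)] = A               # B[pi[i], pi[j]] = A[i, j]

seeds = [0, 1, 2, 3]                # revealed vertices; 4 and 5 stay unrevealed
match = seeded_overlap_match(A, B, seeds, pi[seeds], [4, 5], [4, 5])
print(match)                        # {4: 5, 5: 4}, which agrees with pi
```

In this noise-free toy case the overlap matrix is $\begin{pmatrix} 0 & 2 \\ 2 & 0 \end{pmatrix}$, so the true pairs are separated from the false ones by a clear margin. With edge subsampling ($s < 1$) that margin becomes random, which is the kind of separation a condition such as $\lambda s^{2} > 1 - \alpha$ is meant to control.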
