Exact recovery for seeded graph matching

Nicolas Fraiman; Michael Nisenzon


TL;DR

This work analyzes exact matching between two correlated graphs in the almost fully seeded regime under the Seeded Correlated SBM, showing a sharp threshold: exact recovery is possible in polynomial time whenever $\lambda s^2 > 1-\alpha$, bridging the fully seeded and unseeded settings. It develops and analyzes multiple algorithms, including simple neighborhood-overlap methods and a seeded $\ell_1$-LP with projection (and a fast Frank–Wolfe variant), proving algorithmic achievability above the threshold and information-theoretic impossibility below. The results provide the first tight statistical/computational characterization for graph matching with a vanishing fraction of unrevealed vertices, and empirically demonstrate robust performance on synthetic and real networks, with implications for semi-supervised community detection. Overall, the paper shows that revealing all but $n^{1-\alpha}$ correspondences substantially lowers the information threshold and enables efficient exact recovery in a broad, practically relevant regime.

Abstract

We study graph matching between two correlated networks in the almost fully seeded regime, where all but a vanishing fraction of vertex correspondences are revealed. Concretely, we consider the correlated stochastic block model and assume that $n^{1-\alpha}$ vertices remain unrevealed for some $\alpha \in (0,1)$, while the remaining $n - n^{1-\alpha}$ vertices are provided as seed correspondences. Our goal is to determine when the true permutation can be recovered efficiently as the proportion of unrevealed vertices vanishes. We prove that exact recovery of the remaining correspondences is achievable in polynomial time whenever $\lambda s^{2} > 1 - \alpha$, where $\lambda = (a+b)/2$ is the SBM density parameter and $s$ denotes the edge retention parameter. This condition smoothly interpolates between the fully seeded setting and the classical unseeded threshold $\lambda s^{2} > 1$ for matching in correlated Erdős-Rényi graphs. Our analysis applies to both a simple neighborhood-overlap rule and a bistochastic relaxation followed by projection, establishing matching achievability in the almost fully seeded regime without requiring spectral methods or message passing. On the converse side, we show that below the same threshold, exact recovery is information-theoretically impossible with high probability. Thus, to our knowledge, we obtain the first tight statistical and computational characterization of graph matching when only a vanishing fraction of vertices remain unrevealed. Our results complement recent progress in semi-supervised community detection by demonstrating that revealing all but $n^{1-\alpha}$ correspondences similarly lowers the information threshold for graph matching.
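The threshold condition stated in the abstract can be checked numerically. The following is a minimal sketch that only evaluates the inequality $\lambda s^{2} > 1 - \alpha$ with $\lambda = (a+b)/2$; the function names `lambda_density` and `above_threshold` are hypothetical helpers introduced here for illustration, and the input checks reflect the parameter ranges given in the abstract plus the assumption that $s$ is a retention probability in $[0,1]$.

```python
def lambda_density(a: float, b: float) -> float:
    """SBM density parameter lambda = (a + b) / 2, as defined in the abstract."""
    return (a + b) / 2.0


def above_threshold(a: float, b: float, s: float, alpha: float) -> bool:
    """Return True when lambda * s^2 > 1 - alpha.

    Assumptions (illustrative, not taken from the paper's code):
    alpha lies in (0, 1) and s is an edge retention probability in [0, 1].
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha is expected to lie in (0, 1)")
    if not 0.0 <= s <= 1.0:
        raise ValueError("s is assumed to be a probability in [0, 1]")
    return lambda_density(a, b) * s ** 2 > 1.0 - alpha


# Same graph parameters, two different amounts of side information.
many_seeds = above_threshold(a=3.0, b=1.0, s=0.6, alpha=0.5)  # 0.72 vs 0.5
few_seeds = above_threshold(a=3.0, b=1.0, s=0.6, alpha=0.1)   # 0.72 vs 0.9
```

With $a = 3$, $b = 1$, $s = 0.6$ the left-hand side is $0.72$, which falls below the unseeded threshold of $1$ but exceeds $1 - \alpha$ once $\alpha > 0.28$; this is the sense in which a larger $\alpha$ (fewer unrevealed vertices) lowers the requirement.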


Paper Structure (14 sections, 11 theorems, 102 equations, 1 figure, 3 tables, 4 algorithms)


Introduction
Seeded graph matching
Our contribution
Setup and Notation
Description of the Algorithms
Proofs of the Main Results
Lower Bound: Impossibility Below the Threshold
Upper Bound: Algorithmic Achievability
Experiments and Discussion
Experimental Protocol
Algorithms
Datasets
Results
Discussion

Key Result

Lemma 1

Let $(A,B) \sim \mathop{\mathrm{SCSBM}}\nolimits(n,a,b,s,\mathcal{R},\sigma^*,\pi^*)$ be defined with $|\mathcal{U}| = n^{1-\alpha}$ for some $\alpha \in [0,1]$. Then for any unrevealed $u,v \in \mathcal{U}$ with $v \neq \pi^*(u)$ and any $\varepsilon > 0$, the size of the intersection […]. In addition, if $v = \pi^*(u)$, then for any $\varepsilon > 0$, […].
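Lemma 1 concerns the size of an intersection for pairs of unrevealed vertices, which is presumably the statistic behind the neighborhood-overlap rule mentioned in the abstract. The sketch below shows one plausible form of such a rule, under stated assumptions: the overlap of $u$ and $v$ is taken to be the number of seeds $w$ adjacent to $u$ in the first graph whose revealed partner is adjacent to $v$ in the second graph, and pairs are then matched greedily by score. The function names, the adjacency-dictionary input format, and the greedy tie-breaking are all illustrative choices, not the paper's algorithm.

```python
def overlap_scores(adj_a, adj_b, seeds, unrevealed_a, unrevealed_b):
    """Count common seeded neighbours for every unrevealed pair (u, v).

    adj_a, adj_b: dict mapping each vertex to the set of its neighbours.
    seeds: dict mapping a revealed vertex of graph A to its partner in graph B.
    """
    scores = {}
    for u in unrevealed_a:
        # Images in B of the seeded neighbours of u in A.
        mapped = {seeds[w] for w in adj_a[u] if w in seeds}
        for v in unrevealed_b:
            scores[(u, v)] = len(mapped & adj_b[v])
    return scores


def greedy_match(scores):
    """Pair vertices in decreasing order of overlap (assumed tie-breaking)."""
    matching, used_b = {}, set()
    for (u, v), _ in sorted(scores.items(), key=lambda kv: -kv[1]):
        if u not in matching and v not in used_b:
            matching[u] = v
            used_b.add(v)
    return matching


# Toy instance: seeds 0..3 map to 10..13; the hidden truth is 4 -> 15, 5 -> 14.
adj_a = {0: {4}, 1: {4}, 2: {5}, 3: {5}, 4: {0, 1}, 5: {2, 3}}
adj_b = {10: {15}, 11: {15}, 12: {14}, 13: {14}, 14: {12, 13}, 15: {10, 11}}
seeds = {0: 10, 1: 11, 2: 12, 3: 13}

scores = overlap_scores(adj_a, adj_b, seeds, [4, 5], [14, 15])
matching = greedy_match(scores)
```

In this toy instance the correct pairs each share two seeded neighbours and the incorrect pairs share none, so the greedy step recovers the hidden correspondence. The lemma's role in the paper is to quantify how well such overlaps separate true from false pairs; the exact bounds are not reproduced here.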

Figures (1)

Figure 1: Accuracy as a function of seed fraction for CSBM on 1000 nodes.

Theorems & Definitions (26)

Definition 1: Seeded Correlated SBM
Definition 2: Exact Graph Matching
Definition 3: Equivariance
Definition 4: Hard isolated vertices
Lemma 1
proof
Lemma 2
proof
Lemma 3
proof
...and 16 more
