Bootstrap Learning for Combinatorial Graph Alignment with Sequential GNNs

Marc Lelarge

TL;DR

This work introduces a novel chaining procedure for the graph alignment problem, a fundamental NP-hard task of finding optimal node correspondences between unlabeled graphs using only structural information, and substantially outperforms state-of-the-art solvers on the graph alignment benchmark.

Abstract

Graph neural networks (GNNs) have struggled to outperform traditional optimization methods on combinatorial problems, limiting their practical impact. We address this gap by introducing a novel chaining procedure for the graph alignment problem, a fundamental NP-hard task of finding optimal node correspondences between unlabeled graphs using only structural information. Our method trains a sequence of GNNs where each network learns to iteratively refine similarity matrices produced by previous networks. During inference, this creates a bootstrap effect: each GNN improves upon partial solutions by incorporating discrete ranking information about node alignment quality from prior iterations. We combine this with a powerful architecture that operates on node pairs rather than individual nodes, capturing global structural patterns essential for alignment that standard message-passing networks cannot represent. Extensive experiments on synthetic benchmarks demonstrate substantial improvements: our chained GNNs achieve over 3x better accuracy than existing methods on challenging instances, and uniquely solve regular graphs where all competing approaches fail. When combined with traditional optimization as post-processing, our method substantially outperforms state-of-the-art solvers on the graph alignment benchmark.

Paper Structure

This paper contains 36 sections, 2 theorems, 16 equations, 10 figures, 12 tables.

Key Result

Proposition A.1

For $A$, $B$ Euclidean distance matrices, the indefinite relaxation (eq:convrelax) is tight and solves the GAP (eq:gm). In this case, the GAP computes the Gromov-Monge distance and the indefinite relaxation computes the Gromov-Wasserstein distance.
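
For orientation, the two objects named in the proposition can be written in a standard form. The equations eq:gm and eq:convrelax themselves are not reproduced on this page, so the notation below ($S_n$ for permutations, $\mathcal{P}_n$ for permutation matrices, $\mathcal{D}_n$ for doubly stochastic matrices) is an assumption and may differ from the paper's. With $P_{i\pi(i)}=1$, the graph alignment problem reads

$$
\max_{\pi \in S_n} \sum_{i,j} A_{ij} B_{\pi(i)\pi(j)} \;=\; \max_{P \in \mathcal{P}_n} \langle AP, PB \rangle ,
$$

and the indefinite relaxation enlarges the feasible set from permutation matrices to doubly stochastic matrices:

$$
\max_{D \in \mathcal{D}_n} \langle AD, DB \rangle, \qquad \mathcal{D}_n = \{ D \ge 0 : D\mathbf{1} = \mathbf{1},\ D^\top \mathbf{1} = \mathbf{1} \}.
$$

The objective $D \mapsto \langle AD, DB \rangle$ is in general neither convex nor concave, hence "indefinite". In this notation, tightness means that the relaxed optimum is attained at a permutation matrix, so both problems have the same value.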

Figures (10)

  • Figure 1: Illustration of Step 2. The permutation $\pi$ maps 1→c, 2→a, 3→b, 4→d. Green edges show matches: edge 1-2 with a-c, and edge 1-3 with b-c. Node 1 has the highest score (2 matched edges), nodes 2 and 3 each have 1 matched edge, and node 4 has no matched edges.
  • Figure 2: Overview of the chaining procedure. Starting from input graphs $G_A$ and $G_B$, we first (1) extract features and compute similarities, then iteratively (2) rank nodes by alignment quality, and (3) use rankings to enhance features and similarities.
  • Figure 3: Each line corresponds to chained FGNNs trained at a given noise level and evaluated across all noise levels. Performance is reported as ${\textbf{acc}}$ (in %) for sparse Erdős-Rényi graphs with ${\textbf{Proj}}$ as post-processing.
  • Figure 4: Accuracy ${\textbf{acc}}$ as a function of the noise level for correlated Erdős-Rényi random graphs with size $n=1000$ and average degree $d=3$. Chained GNNs were trained at noise level $0.25$ and ${\textbf{FAQ}}$ is used as the last step of inference. The red curve labeled FAQ corresponds to ${\textbf{FAQ}}(D_{\text{cx}})$, and the blue curve labeled message passing shows results from muratori2024faster. The dashed vertical line marks the theoretical threshold $p_{\text{algo}}=1-\sqrt{\alpha}$, above which no efficient algorithm is known to succeed.
  • Figure 5: Left: Training of chained GNNs. Each color corresponds to a different GNN and training run. The first model (brown) reaches an accuracy below $0.1$. The second (magenta), which takes as input the output of the first, achieves an accuracy of about $0.15$. Subsequent models, each using the output of the previous one as input, attain progressively higher accuracy. Right (top): ${\textbf{acc}}$; Right (bottom): ${\textbf{nce}}$. The first violin plot (${\bf \_faq}$, green) corresponds to ${\textbf{FAQ}}$. The following violin plots show results for different numbers of looping iterations $N_{\max}=0,1,\dots,9$: ${\bf \_gnn\_faq\_proj}$ (blue, with ${\textbf{Proj}}$) and ${\bf \_gnn\_faq\_faq}$ (orange, with ${\textbf{FAQ}}$).
  • ...and 5 more figures
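
The node scoring described in the Figure 1 caption can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function name `matched_edge_scores` is hypothetical, and the edges 3-4 and a-d are assumed filler edges, chosen only so that the graphs contain some unmatched edges while staying consistent with the caption. The permutation and the two matched edges are taken from the caption.

```python
def matched_edge_scores(edges_a, edges_b, perm):
    """For each node of graph A, count its edges that perm maps onto an edge of graph B."""
    # Undirected edges of B, stored as frozensets so (u, v) and (v, u) compare equal.
    b_set = {frozenset(e) for e in edges_b}
    scores = {u: 0 for u in perm}
    for u, v in edges_a:
        if frozenset((perm[u], perm[v])) in b_set:
            # A matched edge counts once for each of its two endpoints.
            scores[u] += 1
            scores[v] += 1
    return scores


# Permutation from the Figure 1 caption: 1→c, 2→a, 3→b, 4→d.
perm = {1: "c", 2: "a", 3: "b", 4: "d"}
# Edges 1-2 and 1-3 come from the caption; edge 3-4 is an assumed unmatched edge.
edges_a = [(1, 2), (1, 3), (3, 4)]
# Edges a-c and b-c come from the caption; edge a-d is an assumed unmatched edge.
edges_b = [("a", "c"), ("b", "c"), ("a", "d")]

scores = matched_edge_scores(edges_a, edges_b, perm)
# Rank nodes from best to worst aligned; ties keep their original order.
ranking = sorted(scores, key=lambda u: -scores[u])

print(scores)   # {1: 2, 2: 1, 3: 1, 4: 0}
print(ranking)  # [1, 2, 3, 4]
```

The resulting scores match the caption: node 1 has two matched edges, nodes 2 and 3 have one each, and node 4 has none. How the chaining procedure turns such a ranking into enhanced features (step 3 in Figure 2) is not specified on this page and is left out of the sketch.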

Theorems & Definitions (2)

  • Proposition A.1
  • Theorem A.2