Perfect Recovery for Random Geometric Graph Matching with Shallow Graph Neural Networks

Suqi Liu, Morgane Austern

TL;DR

This work analyzes graph alignment when two graphs are noisy, edge-subsampled copies of a random geometric graph with sparse binary features. It introduces a two-layer Graph Neural Network with a simple thresholding scheme to produce embeddings, then solves a Hungarian assignment to recover the true vertex permutation. The authors prove that perfect recovery is achievable with high probability under a parameter regime where $\min\{s, \frac{qm}{s}, \frac{qm}{\sigma^2 s^2}\} \gg \log n + \log d$, and show that the noise bound is tight up to log factors; they also show that a direct feature-only matching can fail in regimes where the GNN succeeds. Empirical results on synthetic data and real datasets (Cora, CiteSeer) corroborate the theory, highlighting that GNN-based alignment leverages both graph structure and noisy features to outperform linear methods, especially as noise grows with graph size. The study provides theoretical grounding for the effectiveness of shallow GNNs in graph alignment and illuminates the bias-variance trade-off inherent in aggregating neighbor information on geometric graphs.

Abstract

We study the graph matching problem in the presence of vertex feature information using shallow graph neural networks. Specifically, given two graphs that are independent perturbations of a single random geometric graph with sparse binary features, the task is to recover an unknown one-to-one mapping between the vertices of the two graphs. We show under certain conditions on the sparsity and noise level of the feature vectors, a carefully designed two-layer graph neural network can, with high probability, recover the correct mapping between the vertices with the help of the graph structure. Additionally, we prove that our condition on the noise parameter is tight up to logarithmic factors. Finally, we compare the performance of the graph neural network to directly solving an assignment problem using the noisy vertex features and demonstrate that when the noise level is at least constant, this direct matching fails to achieve perfect recovery, whereas the graph neural network can tolerate noise levels growing as fast as a power of the size of the graph. Our theoretical findings are further supported by numerical studies as well as real-world data experiments.
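
The pipeline described above (aggregate neighbor features, threshold, then solve an assignment problem) can be illustrated with a small toy sketch. This is not the authors' network: the generative model (vertices joined when they share at least `t` features, edges kept with probability `q`, Gaussian feature noise of scale `sigma`), the mean aggregation, the threshold `tau`, and all function names are assumptions made for illustration. Setting `use_graph=False` gives a direct feature-only matching for comparison.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def sample_instance(n=60, d=40, s=4, t=2, q=0.9, sigma=0.3, seed=0):
    """Two noisy, edge-subsampled copies of one parent graph (assumed toy model)."""
    rng = np.random.default_rng(seed)
    # Sparse binary features: each vertex has exactly s ones among d coordinates.
    X = np.zeros((n, d))
    for i in range(n):
        X[i, rng.choice(d, size=s, replace=False)] = 1.0
    # Parent graph (assumption): connect vertices sharing at least t features.
    A = (X @ X.T >= t).astype(float)
    np.fill_diagonal(A, 0.0)

    def perturb():
        # Keep each edge independently with probability q (symmetric mask),
        # and add Gaussian noise of scale sigma to the features.
        keep = np.triu(rng.random((n, n)) < q, k=1)
        keep = keep | keep.T
        return A * keep, X + sigma * rng.standard_normal((n, d))

    g1, (A2, Y2) = perturb(), perturb()
    # Relabel the second copy: its vertex j is parent vertex perm[j].
    perm = rng.permutation(n)
    g2 = (A2[np.ix_(perm, perm)], Y2[perm])
    return g1, g2, perm


def aggregate(A, H):
    """Mean of H over each vertex's closed neighborhood (self included)."""
    A_hat = A + np.eye(len(A))
    return (A_hat @ H) / A_hat.sum(axis=1, keepdims=True)


def embed(A, Y, tau=0.4):
    """Hypothetical two-layer embedding: aggregate and threshold, then aggregate."""
    H = (aggregate(A, Y) > tau).astype(float)  # layer 1: denoise by thresholding
    return aggregate(A, H)                     # layer 2: smooth the binary codes


def align(g1, g2, use_graph=True):
    """Match vertices of g1 to g2 by minimizing total squared embedding distance."""
    E1, E2 = (embed(*g) if use_graph else g[1] for g in (g1, g2))
    cost = ((E1[:, None, :] - E2[None, :, :]) ** 2).sum(axis=-1)
    _, cols = linear_sum_assignment(cost)  # Hungarian-style assignment solver
    return cols  # cols[i] is the g2 vertex matched to g1 vertex i


g1, g2, perm = sample_instance()
truth = np.argsort(perm)  # g1 vertex i corresponds to g2 vertex truth[i]
for use_graph in (True, False):
    acc = np.mean(align(g1, g2, use_graph) == truth)
    print(f"use_graph={use_graph}: fraction matched correctly = {acc:.2f}")
```

The printed fractions depend on the chosen parameters and seed; the sketch only shows how the two matching strategies are wired together and makes no claim about reproducing the paper's recovery guarantees.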

Paper Structure (17 sections, 26 theorems, 148 equations, 5 figures, 3 tables)

Key Result

Theorem 1

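Let the matching problem be as defined in the model section (sc:model), and suppose we solve it using the alignment procedure in eq:mp_align. Then, with probability approaching $1$ as $n \to \infty$, we recover both the true vertex features and the matching if $\min\{s, \frac{qm}{s}, \frac{qm}{\sigma^2 s^2}\} \gg \log n + \log d$, which is the parameter regime stated in the TL;DR.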

Figures (5)

  • Figure 1: Phase diagram of perfect recovery for $t=1$ (left) and $t=2$ (right). Here $\sigma^2 \asymp 1/\sqrt{n}$.
  • Figure 2: Comparison of the GNN and the linear method. Parameters in the experiments are $n=4000$, $d=200$, $s=10$, $t=3$. For (a), $q=0.8$ is fixed and $\sigma$ ranges from $0.1$ to $1$ in $0.1$ increments. For (b), $\sigma=0.4$ and $q$ changes from $0.1$ to $1$ in $0.1$ increments.
  • Figure 3: Comparison of the GNN and the linear method on real-world datasets. In the plots on the left of each group, $q = 1$ is fixed (using all edges from the datasets) and $\sigma$ varies from $0$ to $0.5$ in $0.1$ increments. In the plots on the right of each group, $\sigma = 0.4$ is fixed and $q$ varies from $0.5$ to $1$ in $0.1$ increments.
  • Figure 4: Impact of the threshold parameter on real-world datasets. We fix $q=1$ and $\sigma=0.4$.
  • Figure 5: Impact of the noise parameter on real-world datasets.

Theorems & Definitions (51)

  • Definition 1: Random intersection graph
  • Definition 2: Noisy and incomplete RIG
  • Theorem 1: Perfect recovery
  • Remark 1: Reparameterization
  • Remark 2: Phase diagram
  • Remark 3: Correlated random graph matching
  • Remark 4: Trainability of the GNN
  • Theorem 2: Impossibility of perfect recovery
  • Remark 5
  • Theorem 3: Impossibility of perfect recovery with vertex features
  • ...and 41 more