Efficient estimation of the modified Gromov-Hausdorff distance between unweighted graphs

Vladyslav Oles, Nathan Lemons, Alexander Panchenko

TL;DR

The paper addresses the computational intractability of the Gromov–Hausdorff distance by focusing on its modified variant $\widehat{d}_{\mathcal{GH}}$ and exploiting the structure of curvature sets. It develops polynomial-time lower-bound procedures and an algorithm for estimating $\widehat{d}_{\mathcal{GH}}$, with a concrete implementation in the scikit-tda library for metric spaces induced by unweighted graphs. Experiments on real-world networks (e.g., Enron, LANL, ABIDE I) and synthetic graphs show that the algorithm frequently recovers the exact distance and that the computed distances support meaningful outlier detection. The result is a practical tool for comparing graph shapes and identifying atypical network structures, with potential applications in network science and beyond.

Abstract

Gromov-Hausdorff distances measure the difference in shape between objects representable as compact metric spaces, e.g. point clouds, manifolds, or graphs. Computing any Gromov-Hausdorff distance is equivalent to solving an NP-hard optimization problem, which renders the notion impractical for applications. In this paper we propose a polynomial-time algorithm for estimating the so-called modified Gromov-Hausdorff (mGH) distance, whose topological equivalence with the standard Gromov-Hausdorff (GH) distance was established in Mémoli F, 2012. We implement the algorithm for the case of compact metric spaces induced by unweighted graphs as part of the Python library $\verb|scikit-tda|$, and demonstrate its performance on real-world and synthetic networks. The algorithm finds the mGH distances exactly on most graphs with the scale-free property. We use the computed mGH distances to successfully detect outliers in real-world social and computer networks.

Paper Structure

This paper contains 36 sections, 3 theorems, 37 equations, 5 figures, 3 tables.

Key Result

Theorem 1

Let $K \in \mathcal{K}_n(X)$ be $d$-bounded for some $d > 0$. If $n > |Y|$, then $\widehat{d}_{\mathcal{GH}}(X, Y) \geq \frac{d}{2}$.
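To illustrate how Theorem 1 yields a lower bound, here is a minimal Python sketch. It rests on two assumptions that this summary does not spell out: $\mathcal{K}_n(X)$ is taken to be the set of $n \times n$ distance matrices of $n$-point samples from $X$, and a matrix is taken to be $d$-bounded when all of its off-diagonal entries are at least $d$. The function names are hypothetical, the graphs are assumed to be connected, and the greedy selection is a simple stand-in rather than the authors' procedure or the scikit-tda implementation.

```python
from collections import deque


def distance_matrix(adj):
    """All-pairs shortest-path lengths of a connected unweighted graph.

    `adj` maps each vertex 0..n-1 to a list of its neighbours.
    """
    n = len(adj)
    dist = [[float("inf")] * n for _ in range(n)]
    for source in range(n):
        dist[source][source] = 0
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in adj[u]:
                if dist[source][v] == float("inf"):
                    dist[source][v] = dist[source][u] + 1
                    queue.append(v)
    return dist


def greedy_separated_points(dist, d):
    """Greedily pick points that are pairwise at distance >= d.

    Under the assumed definition, the distance matrix of the chosen
    points is d-bounded. Greedy selection need not find the largest
    such set, so the resulting bound may be weaker than necessary.
    """
    chosen = []
    for p in range(len(dist)):
        if all(dist[p][q] >= d for q in chosen):
            chosen.append(p)
    return chosen


def theorem1_lower_bound(dist_x, size_y):
    """Largest d/2 certified by Theorem 1 using greedy point sets.

    Returns 0.0 if no d-bounded sample with more than |Y| points is found.
    """
    diameter = max(value for row in dist_x for value in row)
    best = 0.0
    for d in range(1, diameter + 1):
        n = len(greedy_separated_points(dist_x, d))
        if n > size_y:  # hypothesis n > |Y| of Theorem 1
            best = max(best, d / 2)
    return best


# X: path graph on 5 vertices; Y: any metric space with 2 points.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
dist_x = distance_matrix(path_graph)
print(theorem1_lower_bound(dist_x, size_y=2))  # 1.0
```

In this example the vertices 0, 2, and 4 of the path are pairwise at distance at least 2, so they give a 2-bounded sample with $n = 3 > |Y| = 2$, and the theorem certifies $\widehat{d}_{\mathcal{GH}}(X, Y) \geq 1$.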

Figures (5)

  • Figure 1: Illustration of the idea underlying the Gromov-Hausdorff distance.
  • Figure 2: Outlier probabilities assigned to the weekly email exchange networks. Red indicates outlier probabilities $> 0.99$, corresponding to the weeks of Sep 17, Oct 29, Nov 5, and Nov 26 in the year 2001.
  • Figure 3: Frequency of red team events in daily authentication activity of the selected users. Grey indicates days with no authentication activity by the user. The dashed line separates the two groups of 20 users.
  • Figure 4: Outlier probability assigned to user-based daily authentication graphs. Red indicates outlier probabilities $> 0.999$. Grey indicates empty graphs (excluded from analysis). The dashed line separates the two groups of 20 users.
  • Figure 5: Outlier probability assigned to brain networks of study subjects. Blue indicates outlier probabilities $> 0.95$, and red indicates outlier probabilities $> 0.999$. The latter correspond to the subjects MaxMun_c_0051332, MaxMun_b_0051323, and MaxMun_c_0051335. The remaining outlier probabilities are 0.

Theorems & Definitions (21)

  • Remark
  • Remark
  • Claim 1
  • Claim 2
  • Theorem 1
  • proof
  • Remark
  • Lemma 1
  • proof
  • Remark
  • ...and 11 more