Gromov-Wasserstein Graph Coarsening

Carlos A. Taveras, Santiago Segarra, César A. Uribe

TL;DR

This work frames graph coarsening through the Gromov-Wasserstein (GW) distance, which compares unaligned graphs of different sizes, so that a graph can be coarsened while preserving its global relational structure. It introduces two algorithms: Greedy Pair Coarsening (GPC), which greedily merges node pairs to recover the minimal GW-representative, and KGPC, a scalable variant that applies $k$-means clustering to a pairwise distortion matrix $H$. The authors provide theoretical guarantees for GPC under separation conditions and demonstrate empirical advantages over baselines in reconstruction distortion and downstream graph classification across multiple datasets. The proposed GW-centric coarsening framework offers a principled, geometry-aware tool for reducing graph size in non-Euclidean data while maintaining essential topology and transport-based alignments.

Abstract

We study the problem of graph coarsening within the Gromov-Wasserstein geometry. Specifically, we propose two algorithms that leverage a novel representation of the distortion induced by merging pairs of nodes. The first method, termed Greedy Pair Coarsening (GPC), iteratively merges pairs of nodes that locally minimize a measure of distortion until the desired size is achieved. The second method, termed $k$-means Greedy Pair Coarsening (KGPC), leverages clustering based on pairwise distortion metrics to directly merge clusters of nodes. We provide conditions guaranteeing optimal coarsening for our methods and validate their performance on six large-scale datasets and a downstream clustering task. Results show that the proposed methods outperform existing approaches on a wide range of parameters and scenarios.
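
The greedy merging loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`coarsen`, `distortion`, `greedy_pair_coarsening`) are hypothetical, the coarsened weights are assumed to be mass-weighted block averages, and the distortion is assumed to be the squared difference between the original weights and the block-averaged weights, weighted by node masses. The paper's own distortion representation is not reproduced here, and the brute-force search over all pairs is only practical for small graphs.

```python
import numpy as np

def coarsen(W, mu, labels):
    """Mass-weighted block average of W over the partition given by labels (assumed merge rule)."""
    k = int(labels.max()) + 1
    P = np.zeros((len(mu), k))
    P[np.arange(len(mu)), labels] = mu        # column p holds the masses of block p
    mass = P.sum(axis=0)                      # total mass of each block
    W_c = (P.T @ W @ P) / np.outer(mass, mass)
    return W_c, mass

def distortion(W, mu, labels):
    """Assumed distortion: mass-weighted squared gap between W and its block average."""
    W_c, _ = coarsen(W, mu, labels)
    diff = W - W_c[np.ix_(labels, labels)]
    return float(mu @ (diff ** 2) @ mu)

def greedy_pair_coarsening(W, mu, n_merges):
    """Repeatedly merge the pair of blocks whose merge gives the lowest distortion."""
    labels = np.arange(len(mu))
    for _ in range(n_merges):
        k = int(labels.max()) + 1
        best_d, best_labels = None, None
        for a in range(k):
            for b in range(a + 1, k):
                trial = labels.copy()
                trial[trial == b] = a         # merge block b into block a
                trial[trial > b] -= 1         # keep block labels contiguous
                d = distortion(W, mu, trial)
                if best_d is None or d < best_d:
                    best_d, best_labels = d, trial
        labels = best_labels
    return labels

# Complete bipartite graph K_{2,3} with uniform node mass.
part = np.array([0, 0, 1, 1, 1])
W = (part[:, None] != part[None, :]).astype(float)
mu = np.full(5, 1 / 5)
labels = greedy_pair_coarsening(W, mu, n_merges=3)
```

In this toy case, three merges reduce the five-node graph to two supernodes, one per side of the bipartition, and the assumed distortion of the result is zero up to floating-point error.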

Paper Structure

This paper contains 13 sections, 5 theorems, 44 equations, 2 figures, 1 table, 1 algorithm.

Key Result

Proposition 1

Given a measure network $G$ with $N$ nodes, GPC recovers the smallest network weakly isomorphic to $G$. Moreover, when $k$ is the size of the minimal representative, GPC($G, N-k$) solves the coarsening problem formulated in the paper (labeled eq:coarsening_formulation).

Figures (2)

  • Figure 1: For each method and graph in a dataset, we coarsen between 15% and 85% of nodes, and average the distortion over all graphs. GPC and KGPC achieve the overall lowest or near-lowest distortion on MSRC, Enzymes, and PTC-MR, whereas MGC performs best on MUTAG.
  • Figure 2: Two weak isomorphism classes of graphs. The leftmost networks in each class have uniform mass on nodes, and the weights of all edges in each class are equal. The rightmost graphs are the minimal representatives, or terminal networks, of their respective classes. Visually, the minimal representatives of the graphs $G_1$ and $G_2$ appear quite similar, but comparing their leftmost representations reveals how different the graphs are. Classes $G_1$ and $G_2$ are both examples of complete bipartite graphs; these classes of graphs benefit most from coarsening to the minimal representative, as a complete $k$-partite network can be reduced to a $k$-node minimal representative.
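
The reduction of a complete $k$-partite network to $k$ nodes, mentioned in the Figure 2 caption, can be checked on a small example. The sketch below is an illustration rather than the paper's procedure: it assumes that two nodes can be merged without distortion exactly when their rows and columns in the weight matrix are identical, and the helper names (`complete_multipartite`, `merge_identical_nodes`) are hypothetical.

```python
import numpy as np

def complete_multipartite(sizes):
    """Unit-weight adjacency matrix of a complete multipartite graph with the given part sizes."""
    part = np.repeat(np.arange(len(sizes)), sizes)
    return (part[:, None] != part[None, :]).astype(float)

def merge_identical_nodes(W, mu):
    """Merge nodes whose rows and columns of W coincide (assumed zero-distortion criterion)."""
    keys = {}
    labels = np.empty(len(mu), dtype=int)
    for i in range(len(mu)):
        signature = (tuple(W[i, :]), tuple(W[:, i]))
        labels[i] = keys.setdefault(signature, len(keys))
    k = len(keys)
    # One representative node per group; its weights stand in for the whole group.
    reps = [int(np.flatnonzero(labels == p)[0]) for p in range(k)]
    W_min = W[np.ix_(reps, reps)]
    # Each supernode carries the total mass of the nodes merged into it.
    mu_min = np.bincount(labels, weights=mu, minlength=k)
    return W_min, mu_min, labels

sizes = [2, 2, 3]                    # complete 3-partite graph on 7 nodes
W = complete_multipartite(sizes)
mu = np.full(W.shape[0], 1 / W.shape[0])
W_min, mu_min, labels = merge_identical_nodes(W, mu)
```

For the part sizes 2, 2, 3 with uniform mass, the reduced network has three nodes, all pairwise connected with unit weight, and each supernode's mass is proportional to the size of its part.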

Theorems & Definitions (8)

  • Proposition 1
  • Proposition 2
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • proof