Gromov-Wasserstein Barycenters: The Analysis Problem

Rocío Díaz Martín, Ivan V. Medri, James M. Murphy

Abstract

This paper considers the problem of estimating a matrix that encodes pairwise distances in a finite metric space (or, more generally, the edge weight matrix of a network) under the barycentric coding model (BCM) with respect to the Gromov-Wasserstein (GW) distance function. We frame this task as estimating the unknown barycentric coordinates with respect to the GW distance, assuming that the target matrix (or kernel) belongs to the set of GW barycenters of a finite collection of known templates. In the language of harmonic analysis, if computing GW barycenters can be viewed as a synthesis problem, this paper aims to solve the corresponding analysis problem. We propose two methods: one utilizing fixed-point iteration for computing GW barycenters, and another employing a differentiation-based approach to the GW structure using a blow-up technique. Finally, we demonstrate the application of the proposed GW analysis approach in a series of numerical experiments and applications to machine learning.
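For orientation, the synthesis and analysis problems named in the abstract can be written as follows for finite spaces. This is only a sketch: it assumes the standard squared-loss form of the GW distance and borrows the notation of Proposition 3.2. The paper's own definitions are not reproduced on this page and may differ, for example in normalization. The dummy variable $\mathbf Z$ is introduced here for illustration.

```latex
\begin{align*}
\mathrm{GW}^2\big((\mathbf X,\mathrm p),(\mathbf Y,\mathrm q)\big)
  &= \min_{\pi\in\Pi(\mathrm p,\mathrm q)} \sum_{i,j,k,l}
     \lvert \mathbf X_{ij}-\mathbf Y_{kl}\rvert^2\,\pi_{ik}\,\pi_{jl},\\
\text{synthesis:}\quad
  & \text{given } \lambda\in\Delta_{S-1},\ \text{find }
    \mathbf Y\in\operatorname*{arg\,min}_{\mathbf Z\in\mathbb{R}^{M\times M}}
    \sum_{s=1}^{S}\lambda_s\,
    \mathrm{GW}^2\big((\mathbf X^s,\mathrm p^s),(\mathbf Z,\mathrm q)\big),\\
\text{analysis:}\quad
  & \text{given } \mathbf Y \text{ assumed to be such a barycenter, recover }
    \lambda\in\Delta_{S-1}.
\end{align*}
```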

Paper Structure

This paper contains 41 sections, 13 theorems, 118 equations, 19 figures, 2 tables, 4 algorithms.

Key Result

Proposition 3.2

Given $S\in \mathbb{N}$, consider templates $( \mathbf X^s, \mathrm{p}^s)\in \mathbb{R}^{N^s\times N^s}\times\mathcal{P}_{N^s}$ for each $s\in [S]$, and $\lambda\in \Delta_{S-1}$. Let $M\in\mathbb{N}$ and let $\mathrm{q}\in \mathcal{P}_M$ be a probability vector. For each $s\in [S]$, let $\pi^s$ be Then, the minimization problem $\min_{\mathbf Y\in \mathbb{R}^{M\times M}} J_\lambda (\mathbf Y)$ i
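The statement above is cut off on this page, so the following is only a hedged illustration of the kind of closed-form update it refers to. It assumes the squared-loss formula from Prop. 3 of peyre2016gromov (which the theorem list cites for this proposition): with couplings $\pi^s$ held fixed, the minimizer is $\mathbf Y = \frac{1}{\mathrm q \mathrm q^T}\odot\sum_s \lambda_s\, \pi^s \mathbf X^s (\pi^s)^T$. The function name `barycenter_update`, the coupling shape `(M, N_s)`, and the toy data are hypothetical choices made for this sketch, not taken from the paper.

```python
import numpy as np

def barycenter_update(templates, couplings, lam, q):
    """Closed-form minimizer over Y for FIXED couplings (squared loss, assumed form).

    templates : list of (N_s, N_s) arrays X^s
    couplings : list of (M, N_s) arrays pi^s (rows: barycenter nodes)
    lam       : weights in the simplex, length S
    q         : strictly positive probability vector of length M
    """
    q = np.asarray(q, dtype=float)
    acc = np.zeros((q.size, q.size))
    for X, pi, w in zip(templates, couplings, lam):
        # Transport template s onto the M barycenter nodes and weight it.
        acc += w * (pi @ X @ pi.T)
    # Entrywise division by q_i * q_j.
    return acc / np.outer(q, q)

# Toy check: when every coupling is the identity plan diag(q),
# the update reduces to the convex combination of the templates.
rng = np.random.default_rng(0)
M = 4
q = np.full(M, 1.0 / M)
X0 = rng.random((M, M)); X0 = (X0 + X0.T) / 2
X1 = rng.random((M, M)); X1 = (X1 + X1.T) / 2
lam = np.array([0.3, 0.7])
Y = barycenter_update([X0, X1], [np.diag(q), np.diag(q)], lam, q)
```

In a fixed-point scheme this update would alternate with recomputing the couplings for the current $\mathbf Y$; that step is omitted here because it needs a GW solver.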

Figures (19)

  • Figure 1: Illustration of GW barycenters between two point clouds from the Point Cloud MNIST 2D dataset [Garcia2023PointCloudMNIST2D], shown for various $t \in [0,1]$ with interpolation coordinates $(1-t, t)$ on the horizontal axis. The top subplot uses the function ot.gromov.gromov_barycenters from the POT Library [flamary2021pot] (based on [peyre2016gromov]). The bottom subplot uses the blow-up technique from [chowdhury2019gromov], which appropriately realigns the nodes and enlarges the template matrices $\mathbf{X}^0$ (zero shape) and $\mathbf{X}^1$ (one shape), creating new versions $\mathbf{X}^0_b$ and $\mathbf{X}^1_b$ of the same size, so that the GW barycenters can be interpreted as convex combinations $(1-t)\mathbf{X}^0_b + t\mathbf{X}^1_b$ (see Remark \ref{rem: geod} in Suppl. Mat. \ref{app: weak}). MDS embedding is used for visualization.
  • Figure 2: Visualization of the GW barycenter space as GW barycenter coordinates, using the Point Cloud MNIST 2D dataset [Garcia2023PointCloudMNIST2D]. Two point clouds serve as templates: one for digit 0s at $(1,0)$ and one for digit 1s at $(0,1)$. Using our proposed Algorithm \ref{alg: analysis}, we compute GW barycenter coordinates for 200 random samples and plot them with blue dots for label 0 and red dots for label 1. Additionally, three corresponding point clouds are shown for illustration. Point clouds clearly corresponding to a digit 0 lie closer to the 0 template (coordinates near $(1,0)$), while those representing digit 1 are closer to the 1 template (coordinates near $(0,1)$).
  • Figure 3: Blow-up illustration.
  • Figure 4: (a) GW barycenter $(\mathbf Y,\mathrm{q})$ consisting of $M=400$ points and uniform mass $\mathrm{q}$, synthesized with the templates above sampled with different numbers of points between $300$ and $500$. (b) MDS of the matrix given by \ref{eq: Y analysis alg} (i.e., $(\mathbf Y_{\textrm{bary}},\mathrm{q})$) after applying the fixed-point analysis Algorithm \ref{alg: analysis} to (a) to estimate $\widetilde{\lambda}\approx \lambda$ (error of order $10^{-14}$). (c) The resulting blow-up of (a) (i.e., $(\mathbf Y_{b},\mathrm{q}_b)$) when Algorithm \ref{alg: blow up} is applied to the templates and (a). The size $M_b$ was about $1500$, which is of the same order as $400+3N$ where $300 \leq N\leq 500$ (cf. Remark \ref{remark: blowup_size}). (d) MDS of the matrix given by \ref{eq: Y analysis alg blow up} (i.e., $(\mathbf Y_{\textrm{barybu}},\mathrm{q}_b)$) after applying the blow-up analysis Algorithm \ref{alg: analysis blow up} to (a) to recover $\widetilde{\lambda}\approx \lambda$ (error of order $10^{-3}$). (i)--(iv) Analogous to (a)--(d) but uniformly sampling the templates with $M=400$ points. Errors in $\lambda$ estimation for (ii) and (iv) are of order $10^{-15}$ and $10^{-16}$, respectively. Blow-up size $M_b=400$. Difference: (a) does not satisfy the test from Remark \ref{remark: test}, while (i) passes the test.
  • Figure 5: Depiction of 100 different randomly generated $\lambda\in \Delta_2$ (dots) and their estimates $\widetilde{\lambda}$ (red crosses) from synthesized GW barycenters, using Algorithm \ref{alg: analysis} (left) and Algorithm \ref{alg: analysis blow up} (right). Each $\lambda$ is represented as a convex combination of the vertices of an equilateral triangle. We compute the mean and standard deviation of the reconstruction error $\lVert \lambda - \widetilde{\lambda} \rVert_2$ and of the average wall-clock time (in seconds) of our GW-analysis methods over $100$ independent runs. For each experiment, we use 3 templates, uniformly sampled at 30 points, with uniform distribution of mass across nodes. The 100 GW barycenters, with $M=30$ points and $\mathrm{q}=\frac{1}{30}(1,\dots,1)\in \mathbb{R}^{30}$ each, were synthesized using ot.gromov.gromov_barycenters from the POT Library. Blow-up average size when using Algorithm \ref{alg: analysis}: $M_b=30$. Blue dots (85) represent synthesized outputs that pass the compatibility test in Remark \ref{remark: test}, whereas green dots (15) do not. In the right-hand boxed panel we run 10 experiments with templates sampled at varying rates and observe that all the synthetic outputs fail to be critical points of the GW synthesis functional (all green dots).
  • ...and 14 more figures
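The Figure 5 caption describes drawing each $\lambda\in\Delta_2$ as a convex combination of the vertices of an equilateral triangle and reporting $\lVert \lambda - \widetilde{\lambda} \rVert_2$. A minimal sketch of that bookkeeping is given below. The vertex positions, the function names, and the Dirichlet sampling are assumptions made for illustration; the page does not say how the paper generated its random $\lambda$ or placed the triangle.

```python
import numpy as np

# Hypothetical placement of the triangle: unit side length, base on the x-axis.
VERTICES = np.array([[0.0, 0.0],
                     [1.0, 0.0],
                     [0.5, np.sqrt(3) / 2]])

def simplex_to_plane(lam):
    """Map barycentric weights (..., 3) to 2D points as convex combinations of VERTICES."""
    return np.asarray(lam, dtype=float) @ VERTICES

def reconstruction_error(lam, lam_hat):
    """Euclidean error between true and estimated weights."""
    return float(np.linalg.norm(np.asarray(lam, dtype=float) - np.asarray(lam_hat, dtype=float)))

# Assumption: lambdas drawn uniformly on the simplex via a flat Dirichlet.
rng = np.random.default_rng(0)
lams = rng.dirichlet(np.ones(3), size=100)
points = simplex_to_plane(lams)   # shape (100, 2), ready for a scatter plot
```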

Theorems & Definitions (49)

  • Definition 2.1: [memoli2011gromov, beier2022linear, sturm2023space, chowdhury2019gromov]
  • Definition 2.2: [memoli2011gromov, chowdhury2019gromov]
  • Definition 2.3: [chowdhury2019gromov]
  • Definition 3.1
  • Proposition 3.2: Prop. 3 of [peyre2016gromov]; revisited proof in Suppl. Mat. \ref{app: gw fp synth}
  • Theorem 3.3
  • proof
  • Corollary 3.4
  • proof
  • Remark 3.5: See also Example \ref{example: simple} in Suppl. Mat. \ref{app: gw fp synth}
  • ...and 39 more