Computing Wasserstein Barycenters through Gradient Flows

Eduardo Fernandes Montesuma, Yassir Bendou, Mike Gartrell

TL;DR

The paper recasts the Wasserstein barycenter problem as a gradient flow in Wasserstein space, which enables mini-batch sampling and regularization through internal, potential, and interaction energies. It gives two practical instantiations, Empirical Flow for particle-based barycenters and Gaussian Mixture Flow for mixtures, with convergence guarantees under a Polyak-Łojasiewicz condition and error bounds for empirical approximations. The approach supports joint measures with label-aware ground costs, which decomposes the barycenter problem into feature and label components. In experiments on toy datasets and multi-source domain adaptation benchmarks, it outperforms prior discrete and neural barycenter methods, particularly when label information is used during domain adaptation.

Abstract

Wasserstein barycenters provide a powerful tool for aggregating probability measures, while leveraging the geometry of their ambient space. Existing discrete methods suffer from poor scalability, as they require access to the complete set of samples from input measures. We address this issue by recasting the original barycenter problem as a gradient flow in the Wasserstein space. Our approach offers two advantages. First, we achieve scalability by sampling mini-batches from the input measures. Second, we incorporate functionals over probability measures, which regularize the barycenter problem through internal, potential, and interaction energies. We present two algorithms for empirical and Gaussian mixture measures, providing convergence guarantees under the Polyak-Łojasiewicz inequality. Experimental validation on toy datasets and domain adaptation benchmarks shows that our methods outperform previous discrete and neural net-based methods for computing Wasserstein barycenters.
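To make the idea of a mini-batch particle flow concrete, here is a minimal sketch in one dimension. It is not the paper's Empirical Flow algorithm: it assumes 1-D inputs, where the optimal $\mathbb{W}_{2}$ coupling between two equal-size samples simply pairs points in sorted order, and it uses a plain explicit Euler step with no regularizing energies. The names `sorted_matching_targets` and `barycenter_flow_1d`, and all parameter values, are hypothetical.

```python
import random
import statistics


def sorted_matching_targets(particles, batch):
    """Return, for each particle, its matched batch point.

    Assumption: 1-D samples of equal size, where the optimal W2
    coupling pairs the r-th smallest particle with the r-th smallest
    batch point.
    """
    order = sorted(range(len(particles)), key=lambda i: particles[i])
    batch_sorted = sorted(batch)
    targets = [0.0] * len(particles)
    for rank, i in enumerate(order):
        targets[i] = batch_sorted[rank]
    return targets


def barycenter_flow_1d(samplers, weights, n_particles=200,
                       n_steps=300, step=0.2, seed=0):
    """Move particles along the negative gradient of
    sum_k weights[k] * 0.5 * W2^2(particles, Q_k),
    estimating each term from a fresh mini-batch of Q_k."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    for _ in range(n_steps):
        drift = [0.0] * n_particles
        for sampler, w in zip(samplers, weights):
            # Fresh mini-batch from the k-th input measure.
            batch = [sampler(rng) for _ in range(n_particles)]
            targets = sorted_matching_targets(particles, batch)
            for i in range(n_particles):
                # Per-particle gradient direction: x - T_k(x).
                drift[i] += w * (particles[i] - targets[i])
        particles = [x - step * d for x, d in zip(particles, drift)]
    return particles


# Two hypothetical Gaussian inputs, N(-2, 0.5^2) and N(4, 1.5^2).
samplers = [lambda rng: rng.gauss(-2.0, 0.5),
            lambda rng: rng.gauss(4.0, 1.5)]
weights = [0.5, 0.5]
particles = barycenter_flow_1d(samplers, weights)
print(statistics.mean(particles), statistics.pstdev(particles))
```

In one dimension the $\mathbb{W}_{2}$ barycenter averages quantile functions, so for these two Gaussians with equal weights it is a Gaussian with mean $1$ and standard deviation $1$; the printed statistics should land close to those values, up to mini-batch noise.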

Paper Structure

This paper contains 38 sections, 8 theorems, 132 equations, 13 figures, 10 tables, 3 algorithms.

Key Result

Proposition 3.1

Let $P = \sum \pi_{i}^{(P)}(\mathcal{N}(\mu_{i}^{(P)}, \Sigma_{i}^{(P)}) \otimes \delta_{\nu_{i}^{(P)}})$ and $Q = \sum \pi_{j}^{(Q)}(\mathcal{N}(\mu_{j}^{(Q)}, \Sigma_{j}^{(Q)}) \otimes \delta_{\nu_{j}^{(Q)}})$ be two mixtures over $\Omega = \mathcal{X} \times \mathcal{Y}$. Let the ground cost $c$ be $\ldots$, where $\rho(y, y')$ is a metric over $\mathcal{Y}$. Then $\ldots$, where $C_{ij} = \mathbb{W}_{2}(P_{i,x}, Q_{j,x}) \ldots$
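
As background for the term $\mathbb{W}_{2}(P_{i,x}, Q_{j,x})$: assuming $P_{i,x}$ and $Q_{j,x}$ denote the Gaussian feature components $\mathcal{N}(\mu_{i}^{(P)}, \Sigma_{i}^{(P)})$ and $\mathcal{N}(\mu_{j}^{(Q)}, \Sigma_{j}^{(Q)})$ (an assumption, since the statement above is truncated), the standard closed form for the 2-Wasserstein distance between two Gaussians applies:

$$
\mathbb{W}_{2}\big(\mathcal{N}(\mu_{1}, \Sigma_{1}), \mathcal{N}(\mu_{2}, \Sigma_{2})\big)^{2} = \lVert \mu_{1} - \mu_{2} \rVert_{2}^{2} + \operatorname{Tr}\Big(\Sigma_{1} + \Sigma_{2} - 2\big(\Sigma_{1}^{1/2} \Sigma_{2} \Sigma_{1}^{1/2}\big)^{1/2}\Big)
$$

This is the general Gaussian formula, not the proposition's result; how the paper combines it with the label metric $\rho$ is not recoverable from the text above.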

Figures (13)

  • Figure 1: In (a), we show the usual static notion of the Wasserstein barycenter (in red), which minimizes the sum of distances to the input measures (in blue). In (b), we show our notion of barycenter as a gradient flow, flowing an initial measure $P_{0}$ (purple) to the barycenter $P^{\star}$ (yellow) of the input measures.
  • Figure 2: Location-scatter family generated by a Swiss-roll measure, $Q_{0}$. Each measure $Q_{k} = T_{k,\sharp}Q_{0}$, for $T_{k}(x) = A_{k}x + b_{k}$.
  • Figure 3: Comparison between Wasserstein barycenter solvers. Colored scatter plots indicate labeled barycenters. For each solver, we compute the Wasserstein distance between its solution $\hat{P}$ and the ground-truth $P^{\star}$, shown in the title of each subfigure (best seen on screen). Overall, using label information leads to barycenters that better approximate the ground-truth barycenter.
  • Figure 4: t-SNE (van der Maaten and Hinton, 2008) visualization of the barycenter of $[\text{Art}, \text{Product}, \text{Real-World}]$ source domains in the Office-Home benchmark. Colors represent different classes from $1$ to $65$. Overall, the classes tend to be more separated when using the repulsion functional $\mathbb{U}$.
  • Figure 4: Office 31.
  • ...and 8 more figures
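
The location-scatter construction in the Figure 2 caption, $Q_{k} = T_{k,\sharp}Q_{0}$ with $T_{k}(x) = A_{k}x + b_{k}$, can be sketched by sampling from $Q_{0}$ and applying each affine map to the samples. The sketch below assumes a 2-D Swiss roll with a hypothetical parametrization and noise level; the function names and the example matrices $A_{k}$ and offsets $b_{k}$ are illustrative choices, not values from the paper.

```python
import math
import random


def sample_swiss_roll(n, rng, noise=0.1):
    """Hypothetical 2-D Swiss-roll sampler standing in for Q_0."""
    points = []
    for _ in range(n):
        t = 1.5 * math.pi * (1.0 + 2.0 * rng.random())
        x = t * math.cos(t) + rng.gauss(0.0, noise)
        y = t * math.sin(t) + rng.gauss(0.0, noise)
        points.append((x, y))
    return points


def push_forward_affine(points, A, b):
    """Apply T(x) = A x + b to each sample.

    If the inputs are samples of Q_0, the outputs are samples of
    the push-forward measure T#Q_0.
    """
    return [
        (A[0][0] * x + A[0][1] * y + b[0],
         A[1][0] * x + A[1][1] * y + b[1])
        for x, y in points
    ]


rng = random.Random(0)
q0 = sample_swiss_roll(500, rng)

# Illustrative (A_k, b_k) pairs; the first is the identity map.
maps = [
    ([[1.0, 0.0], [0.0, 1.0]], (0.0, 0.0)),
    ([[2.0, 0.5], [0.5, 1.0]], (10.0, -5.0)),
]
family = [push_forward_affine(q0, A, b) for A, b in maps]
```

Each entry of `family` is a sample set from one member $Q_{k}$ of the family, all sharing the same underlying shape up to an affine change of coordinates.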

Theorems & Definitions (17)

  • Remark 3.1
  • Proposition 3.1
  • Theorem 3.1
  • Theorem 3.2
  • Definition C.1
  • Proposition D.1
  • proof
  • Proposition D.2
  • proof
  • Theorem D.1
  • ...and 7 more