Estimating Barycenters of Distributions with Neural Optimal Transport

Alexander Kolesov, Petr Mokrov, Igor Udovichenko, Milena Gazdieva, Gudmund Pammer, Evgeny Burnaev, Alexander Korotin

TL;DR

This work addresses the problem of aggregating multiple probability distributions by estimating Wasserstein barycenters within a Neural OT framework. It introduces a max-min dual formulation for weak OT barycenters, parameterizes transport plans with stochastic maps, and uses a bi-level optimization that handles general cost functions beyond the quadratic case. Theoretical error bounds are provided for approximate solutions under the classical, $\epsilon$-KL, and $\gamma$-Energy cost families, and the method is demonstrated on high-dimensional data, including image spaces and StyleGAN latent manifolds. Empirically, the approach yields accurate and flexible barycenters in 2D and real-data experiments (Shape-Color and "Ave, celeba!"), with competitive FID/UVP metrics and favorable training/inference characteristics. The code is publicly available for reproducibility.

Abstract

Given a collection of probability measures, a practitioner sometimes needs to find an "average" distribution which adequately aggregates the reference distributions. A theoretically appealing notion of such an average is the Wasserstein barycenter, which is the primary focus of our work. Building upon the dual formulation of Optimal Transport (OT), we propose a new scalable approach for solving the Wasserstein barycenter problem. Our methodology is based on the recent Neural OT solver: it has a bi-level adversarial learning objective and works for general cost functions. These are key advantages of our method, since typical adversarial algorithms for barycenter tasks rely on tri-level optimization and focus mostly on the quadratic cost. We also establish theoretical error bounds for our proposed approach and showcase its applicability and effectiveness in illustrative scenarios and image data setups. Our source code is available at https://github.com/justkolesov/NOTBarycenters.
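To make the notion of an "average" distribution concrete, here is a minimal Python sketch of a Wasserstein-2 barycenter in the simplest possible setting: one-dimensional empirical samples of equal size under the quadratic cost. In that special case the barycenter can be obtained by averaging sorted samples (quantile averaging). This is a classical closed-form shortcut used only for illustration. It is not the neural solver proposed in the paper, and the function name `barycenter_1d` and the toy Gaussians are assumptions introduced here.

```python
import random
import statistics


def barycenter_1d(samples, weights):
    """W2 barycenter of equal-size 1D samples via weighted quantile averaging.

    Illustrative helper (hypothetical name): valid only in 1D with quadratic cost.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same size")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Sorting each sample gives its empirical quantiles.
    sorted_samples = [sorted(s) for s in samples]
    # The i-th barycenter point is the weighted mean of the i-th quantiles.
    return [
        sum(w * s[i] for w, s in zip(weights, sorted_samples))
        for i in range(n)
    ]


# Toy reference distributions (assumed for the example): two 1D Gaussians.
rng = random.Random(0)
p1 = [rng.gauss(-2.0, 0.5) for _ in range(5000)]
p2 = [rng.gauss(4.0, 1.5) for _ in range(5000)]

bary = barycenter_1d([p1, p2], [0.5, 0.5])
bary_mean = statistics.fmean(bary)
bary_std = statistics.pstdev(bary)
print(f"barycenter mean ~ {bary_mean:.2f}, std ~ {bary_std:.2f}")
```

For 1D Gaussians under the quadratic cost, the barycenter is again Gaussian, with mean and standard deviation equal to the weighted averages of the inputs, so the printed values should land near 1 and 1 for this toy setup. In higher dimensions or with general costs no such shortcut exists, which is the regime the paper targets.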

Paper Structure (26 sections, 2 theorems, 62 equations, 13 figures, 7 tables, 1 algorithm)

Key Result

Theorem 4.1

The optimal value $\mathcal{L}^*$ of the weak OT barycenter problem (in its primal form) equals the optimal value of a $\max$-$\min$ objective.
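As a hedged sketch of the shape such a $\max$-$\min$ objective takes, assume $K$ reference distributions $\mathbb{P}_k$ on spaces $\mathcal{X}_k$, weights $\lambda_k > 0$ with $\sum_k \lambda_k = 1$, weak costs $C_k$, potentials $f_k$ on the barycenter space $\mathcal{Y}$, and conditional plans $\pi_k(\cdot \mid x_k)$. All of this notation is an assumption based on standard weak OT duality, not copied from the source, and the exact function classes and conditions are those stated in the paper:

```latex
\mathcal{L}^*
= \sup_{\substack{f_1, \dots, f_K \\ \sum_{k} \lambda_k f_k = 0}}
  \;\inf_{\pi_1, \dots, \pi_K}\;
  \sum_{k=1}^{K} \lambda_k
  \left[
    \int_{\mathcal{X}_k} C_k\bigl(x_k, \pi_k(\cdot \mid x_k)\bigr)\, d\mathbb{P}_k(x_k)
    - \int_{\mathcal{X}_k} \int_{\mathcal{Y}} f_k(y)\, d\pi_k(y \mid x_k)\, d\mathbb{P}_k(x_k)
  \right]
```

The constraint $\sum_k \lambda_k f_k = 0$ is what removes the unknown barycenter from the objective: for any candidate barycenter $\mathbb{Q}$, the terms $\lambda_k \int f_k \, d\mathbb{Q}$ sum to zero. The outer maximization over potentials and the inner minimization over plans give the two levels of the bi-level scheme.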

Figures (13)

  • Figure 1: 2D Twister experiment. Contours represent the prior distribution $\mu_0$.
  • Figure 2: Our learned stochastic maps to the OT barycenter in the Shape-Color experiment.
  • Figure 3: Samples from the StyleGAN $G$ (which represents the manifold $\mathcal{M}$) trained on colored MNIST digits "2" and "3".
  • Figure 4: Learned (stochastic) maps to the OT barycenter by different solvers; Ave, celeba! experiment.
  • Figure 5: Learned (stochastic) maps to the OT barycenter by different solvers; MNIST 0/1 experiment (appendix).
  • ...and 8 more figures

Theorems & Definitions (4)

  • Theorem 4.1: $\max$-$\min$ formulation for OT barycenter
  • Theorem 4.2: Quality bounds for recovered plans
  • proof
  • proof