Computing Optimal Transport Maps and Wasserstein Barycenters Using Conditional Normalizing Flows
Gabriele Visentin, Patrick Cheridito
TL;DR
The paper introduces a primal, generative approach to computing optimal transport (OT) maps and Wasserstein-2 barycenters in high dimensions. It trains conditional normalizing flows f(·,s) that connect each input distribution s to a common latent space with reference distribution λ. Because the pushforward constraints are enforced through likelihood-based objectives, the method avoids dual or adversarial training and yields both OT maps and a generative model of the barycenter, h(Z)=∑_s w_s f(Z,s), where Z is a latent sample and w_s is the weight of input s. Theoretical results relate OT distances to L^p(λ) differences between the flows and show that the barycenter arises as a conditional expectation that minimizes a conditional variance, which makes computation feasible for hundreds of input distributions. Empirically, the approach achieves high accuracy across Gaussian, uniform, Swiss-roll, MNIST, and large-n datasets, often outperforming state-of-the-art baselines in both quality and scalability. The framework thus supports practical, sample-efficient barycenter construction and transport in high-dimensional settings, with applications in statistics, imaging, and fairness.
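The link between OT distances, L^p(λ) differences, and conditional variance can be made concrete with a standard coupling argument. The block below is a sketch under stated assumptions, not a restatement of the paper's theorems: the index variable S, the measures μ_s, and the independence assumption are notation introduced here for illustration.

```latex
% Assumptions (illustrative notation beyond h, f, w_s, Z, \lambda):
% Z \sim \lambda, f(\cdot, s)_{\#}\lambda = \mu_s, and S is a random index
% independent of Z with P(S = s) = w_s.
\begin{align*}
W_p(\mu_s, \mu_t) &\le \lVert f(\cdot, s) - f(\cdot, t) \rVert_{L^p(\lambda)}, \\
h(Z) &= \mathbb{E}[\, f(Z, S) \mid Z \,] = \sum_s w_s f(Z, s), \\
\sum_s w_s W_2^2(\mu_s, h_{\#}\lambda)
  &\le \sum_s w_s \, \mathbb{E}\lVert f(Z, s) - h(Z) \rVert^2
   = \mathbb{E}\big[ \operatorname{tr} \operatorname{Cov}( f(Z, S) \mid Z ) \big].
\end{align*}
```

Both inequalities hold because (f(Z,s), f(Z,t)) and (f(Z,s), h(Z)) are couplings of the corresponding marginals, so their expected cost bounds the Wasserstein distance from above. Minimizing the right-hand side over flows that satisfy the pushforward constraints is the variance minimization the summary refers to; whether the bound is attained is the paper's result and is not derived here.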
Abstract
We present a novel method for efficiently computing optimal transport maps and Wasserstein barycenters in high-dimensional spaces. Our approach uses conditional normalizing flows to approximate the input distributions as invertible pushforward transformations from a common latent space. This makes it possible to directly solve the primal problem using gradient-based minimization of the transport cost, unlike previous methods that rely on dual formulations and complex adversarial optimization. We show how this approach can be extended to compute Wasserstein barycenters by solving a conditional variance minimization problem. A key advantage of our conditional architecture is that it enables the computation of barycenters for hundreds of input distributions, which was computationally infeasible with previous methods. Our numerical experiments illustrate that our approach yields accurate results across various high-dimensional tasks and compares favorably with previous state-of-the-art methods.
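The construction described above can be illustrated in one dimension, where monotone maps are optimal and the answer is known in closed form. The following is a minimal sketch, not the authors' implementation: the affine functions `f` and `f_inv` stand in for trained conditional normalizing flows, and the Gaussian parameters, weights, and function names are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy inputs: three 1D Gaussians N(m_s, sd_s^2) with barycenter weights w_s.
means = np.array([-2.0, 0.5, 3.0])
stds = np.array([0.5, 1.0, 2.0])
weights = np.array([0.2, 0.3, 0.5])


def f(z, s):
    """Stand-in for a trained conditional flow: pushes N(0, 1) forward to input s."""
    return means[s] + stds[s] * z


def f_inv(x, s):
    """Inverse flow: maps samples of input s back to the shared latent space."""
    return (x - means[s]) / stds[s]


def transport(x, s, t):
    """Map samples of input s to input t by composing through the latent space."""
    return f(f_inv(x, s), t)


def barycenter(z):
    """h(z) = sum_s w_s f(z, s): pushes latent samples to the barycenter."""
    return sum(w * f(z, s) for s, w in enumerate(weights))


z = rng.standard_normal(200_000)
x0 = f(z, 0)                      # samples from input 0
x0_to_2 = transport(x0, 0, 2)     # the same samples moved to input 2
b = barycenter(z)                 # samples from the barycenter model

# Monte Carlo transport cost versus the closed-form W2^2 between 1D Gaussians.
cost = np.mean((x0 - x0_to_2) ** 2)
w2_sq = (means[0] - means[2]) ** 2 + (stds[0] - stds[2]) ** 2
print("transport cost:", cost, "closed-form W2^2:", w2_sq)

# In 1D the Gaussian barycenter is N(sum_s w_s m_s, (sum_s w_s sd_s)^2).
print("barycenter mean/std:", b.mean(), b.std())
print("closed-form mean/std:", weights @ means, weights @ stds)
```

In this toy case the flows are already optimal, so no training step appears. In higher dimensions, a flow that matches each input distribution is not automatically cost-minimizing, which is why the method additionally minimizes the transport cost (or the conditional variance for barycenters) by gradient descent.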
