Differentiable Generalized Sliced Wasserstein Plans

Laetitia Chapel; Romain Tavenard; Samuel Vaiter

TL;DR

This work addresses the computational cost of computing optimal transport (OT) plans by introducing Differentiable Generalized Sliced Wasserstein Plans (DGSWP), which extend slicing-based OT with non-linear projections and a differentiable bilevel optimization framework. A Stein-based smoothing of the outer objective yields a differentiable, GPU-efficient surrogate whose gradients guide the choice of projection map, so that meaningful transport plans can be recovered even in high dimensions and on manifolds. Empirically, DGSWP achieves lower transport costs than prior sliced methods, enables robust gradient flows in Euclidean and hyperbolic spaces, and replaces costly mini-batch OT in conditional flow matching for image generation. The approach makes OT more practical for large-scale learning tasks, including manifold-valued data and generative modeling, and leaves open how to guarantee injectivity of the projection, for example through injective neural architectures.

Abstract

Optimal Transport (OT) has attracted significant interest in the machine learning community, not only for its ability to define meaningful distances between probability distributions -- such as the Wasserstein distance -- but also for its formulation of OT plans. The computational complexity of OT remains a bottleneck, though, and slicing techniques have been developed to scale it to large datasets. A recent slicing scheme, dubbed min-SWGG, lifts a single one-dimensional plan back to the original multidimensional space and selects the slice that yields the lowest Wasserstein distance as an approximation of the full OT plan. Despite its computational and theoretical advantages, min-SWGG inherits typical limitations of slicing methods: (i) the number of required slices grows exponentially with the data dimension, and (ii) it is constrained to linear projections. Here, we reformulate min-SWGG as a bilevel optimization problem and propose a differentiable approximation scheme to efficiently identify the optimal slice, even in high-dimensional settings. We further define a generalized extension that accommodates data living on manifolds. Finally, we demonstrate the practical value of our approach in various applications, including gradient flows on manifolds and in high-dimensional spaces, as well as a novel sliced OT-based conditional flow matching for image generation -- where fast computation of transport plans is essential.
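The slicing idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes two point clouds of equal size with uniform weights and a squared Euclidean cost, and it picks the slice by plain random search over linear directions rather than by the differentiable bilevel scheme the paper proposes. The function names `sliced_plan_cost` and `min_swgg_random_search` are hypothetical.

```python
import numpy as np


def sliced_plan_cost(X, Y, theta):
    """Cost in the original space of the plan induced by sorting along theta.

    Assumes X and Y have the same number of points and uniform weights.
    """
    sx = np.argsort(X @ theta)  # order of X along the slice
    sy = np.argsort(Y @ theta)  # order of Y along the slice
    # The 1D optimal plan pairs the i-th smallest projection of X with the
    # i-th smallest projection of Y; the cost is evaluated on the full points.
    cost = np.mean(np.sum((X[sx] - Y[sy]) ** 2, axis=1))
    return cost, sx, sy


def min_swgg_random_search(X, Y, n_slices=200, seed=0):
    """Keep the random unit direction whose lifted plan has the lowest cost.

    Random search is an assumption made for this sketch only.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    best = (np.inf, None, None, None)
    for _ in range(n_slices):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)  # project onto the unit sphere
        cost, sx, sy = sliced_plan_cost(X, Y, theta)
        if cost < best[0]:
            best = (cost, theta, sx, sy)
    return best  # (cost, direction, permutation of X, permutation of Y)
```

The returned permutations define a valid transport plan between the two clouds (point `X[sx[i]]` is sent to `Y[sy[i]]`), so its cost is an upper bound on the exact squared Wasserstein distance. Because the search is over random directions, the number of slices needed to find a good one grows quickly with the dimension, which is limitation (i) that the differentiable formulation is meant to address.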
