The Wasserstein gradient flow of the Sinkhorn divergence between Gaussian distributions

Mathis Hardion; Théo Lacombe

TL;DR

This work analyzes the Wasserstein gradient flow of the Sinkhorn divergence $S_\varepsilon(\cdot,\mu_\star)$ when both source and target are Gaussian, establishing existence, Gaussian-invariance, and uniqueness (within a regular measure class) of the flow. It derives explicit mean and covariance dynamics, proving global convergence to the target when the initial covariance is non-singular and detailing limit behavior when singular or commuting covariances arise; in the commuting case, it shows exponential convergence for full-support targets and $O(t^{-1})$ rates when the target lies in a subspace. The results connect closed-form Gaussian Sinkhorn formulas with the Bures–Wasserstein geometry, yielding precise eigenvalue evolutions and energy-dissipation relations, and are complemented by explicit time-discretization schemes and numerical experiments. Collectively, the paper provides a rigorous partial convergence theory for Gaussian Sinkhorn flows and offers practical guidance for simulations and potential extensions to broader settings.
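The link between closed-form Gaussian Sinkhorn formulas and the Bures–Wasserstein geometry can be illustrated numerically. The sketch below is not taken from the paper: it assumes the squared Euclidean cost $\|x-y\|^2$ and the regularizer $\varepsilon\,\mathrm{KL}(\pi \mid \alpha\otimes\beta)$, for which the entropic cost between Gaussians has a known closed form (Janati et al., 2020, written there with $\varepsilon = 2\sigma^2$). The paper's normalization of the cost and of $\varepsilon$ may differ, and all function names are hypothetical.

```python
import numpy as np

def sqrtm_psd(M):
    """Square root of a symmetric PSD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def bures_wasserstein_sq(m0, S0, m1, S1):
    """Squared 2-Wasserstein distance between N(m0, S0) and N(m1, S1)."""
    r = sqrtm_psd(S0)
    cross = sqrtm_psd(r @ S1 @ r)
    return float(np.sum((m0 - m1) ** 2)
                 + np.trace(S0) + np.trace(S1) - 2.0 * np.trace(cross))

def entropic_ot_gauss(m0, S0, m1, S1, eps):
    """Entropic OT cost between Gaussians (assumed convention: eps = 2 * sigma^2)."""
    sigma2 = eps / 2.0
    d = S0.shape[0]
    r = sqrtm_psd(S0)
    D = sqrtm_psd(4.0 * r @ S1 @ r + sigma2 ** 2 * np.eye(d))
    # D + sigma2 * I is positive definite, so the log-determinant is finite
    # even when S0 or S1 is singular.
    _, logdet = np.linalg.slogdet(D + sigma2 * np.eye(d))
    return float(np.sum((m0 - m1) ** 2)
                 + np.trace(S0) + np.trace(S1) - np.trace(D)
                 + d * sigma2 * (1.0 - np.log(2.0 * sigma2))
                 + sigma2 * logdet)

def sinkhorn_divergence_gauss(m0, S0, m1, S1, eps):
    """Debiased divergence: OT(a, b) - OT(a, a) / 2 - OT(b, b) / 2."""
    return (entropic_ot_gauss(m0, S0, m1, S1, eps)
            - 0.5 * entropic_ot_gauss(m0, S0, m0, S0, eps)
            - 0.5 * entropic_ot_gauss(m1, S1, m1, S1, eps))

# Illustrative inputs (not from the paper).
m0, S0 = np.zeros(2), np.eye(2)
m1, S1 = np.array([1.0, 0.0]), np.diag([2.0, 0.5])

w2 = bures_wasserstein_sq(m0, S0, m1, S1)
s_small = sinkhorn_divergence_gauss(m0, S0, m1, S1, eps=1e-6)
s_one = sinkhorn_divergence_gauss(m0, S0, m1, S1, eps=1.0)
```

Under these assumptions, `s_small` should be close to `w2`, reflecting that the divergence recovers the squared Bures–Wasserstein distance as $\varepsilon \to 0$, while the divergence of a Gaussian with itself vanishes for any $\varepsilon$.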

Abstract

We study the Wasserstein gradient flow of the Sinkhorn divergence when both the source and the target are Gaussian distributions. We prove the existence of a flow that stays in the class of Gaussian distributions and is unique in the larger class of measures with strongly concave and smooth log-densities. We prove that the flow converges globally toward the target measure when the source's covariance matrix is non-singular, and provide counter-examples to global convergence when it is singular, giving a first answer to an open question raised in [Carlier et al. 2024, §4.2]. When the covariance matrix of the source distribution commutes with that of the target, we derive more quantitative results: convergence toward the target is exponential when the source and the target share their support, but drops to linear rates ($O(t^{-1})$) if the target is concentrated on a strict subspace of the source's support.
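The gap between an exponential rate and an $O(t^{-1})$ rate can be seen on two scalar toy equations. The sketch below does not use the paper's eigenvalue dynamics: the ODEs $\dot x = -x$ and $\dot x = -x^2$ are assumed stand-ins, chosen only because the first has a non-degenerate linear restoring term while the second has none near its equilibrium. The step size, horizon, and helper name are illustrative.

```python
import numpy as np

def explicit_euler(rhs, x0, dt, n_steps):
    """Integrate the scalar ODE x' = rhs(x) with the explicit Euler scheme."""
    xs = np.empty(n_steps + 1)
    xs[0] = x0
    for k in range(n_steps):
        xs[k + 1] = xs[k] + dt * rhs(xs[k])
    return xs

dt, n_steps = 1e-2, 10_000           # time horizon T = 100
t = dt * np.arange(n_steps + 1)

# Toy model with a linear restoring term: exact solution exp(-t).
x_exp = explicit_euler(lambda x: -x, 1.0, dt, n_steps)

# Toy model with a degenerate (quadratic) term: exact solution 1 / (1 + t).
x_alg = explicit_euler(lambda x: -x ** 2, 1.0, dt, n_steps)

# t * x(t) tends to a positive constant exactly when the decay is O(1/t).
print("exponential toy model, x(T):", x_exp[-1])
print("algebraic toy model, T * x(T):", t[-1] * x_alg[-1])
```

On a semi-log plot the first trajectory is a straight line, while the second only becomes a straight line in log-log scale, which is the same diagnostic used to tell the two regimes apart in plots of the divergence over time.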

Paper Structure (26 sections, 17 theorems, 68 equations, 4 figures)

Introduction
Outline and Contributions.
Related work
Optimal transport and its entropic regularization.
Wasserstein gradient flows.
The gradient flow of $S_\varepsilon$.
Distinction from Sinkhorn geodesics, barycenters, and Schrödinger bridges.
Optimal transport and optimization on the Bures--Wasserstein space.
Notations and setting
Background and preliminary results
Wasserstein Gradient Flows
Preliminary results on EOT between Gaussian measures
Well-posedness of the flow
Existence
Characterization as the Wasserstein gradient flow of the Sinkhorn divergence
...and 11 more sections

Key Result

Theorem 1.1

Let $\mu_0, \mu_\star$ be Gaussian measures, and $\mathrm{supp}(\mu_0)$, $\mathrm{supp}(\mu_\star)$ their respective supports. There exists a unique solution $(\mu_t)_t$ of \eqref{eq:SWGF} that stays Gaussian; it is a Wasserstein gradient flow of $S_\varepsilon(\cdot, \mu_\star)$ in the sense of \cite{ambrosio20}

Figures (4)

Figure 1: Covariance ellipses of the flow for a non-singular source and a non-singular (left) vs singular (right) target (red). The gray grid lines are spaced by $\sqrt{\varepsilon}$.
Figure 2: Covariance ellipses for singular Gaussian distributions, in an orthogonal configuration (left, with y-axis marginal for visual clarity) vs. slightly rotated (right).
Figure 3: Values of $S_\varepsilon(\mu_t, \mu_\star)/S_\varepsilon(\mu_0,\mu_\star)$ over time for $\Sigma_0 = \mathrm{Id}$ (commuting case, left) and $\Sigma_0$ the same as in Figure 1 (non-commuting case, middle), with target $\Sigma_\star = \mathrm{diag}(0, \lambda^\star)$ for different values of $\lambda^\star$, as a semi-log plot. On the right, the conditions are the same as in the middle panel, but over a longer time interval and in log-log scale.
Figure 4: Values of $S_\varepsilon(\mu_t, \mu_\star)/S_\varepsilon(\mu_0,\mu_\star)$ over time for different values of $\varepsilon$ (same source and target as in the left panel of Figure 1).

Theorems & Definitions (37)

Theorem 1.1
Proposition 1
Lemma 1
proof
proof: Proof of \ref{thm:Seps-gauss}
Lemma 2
proof
Theorem 3.1
Theorem 3.2
Lemma 3
...and 27 more
