Gaussian entropic optimal transport: Schrödinger bridges and the Sinkhorn algorithm

O. Deniz Akyildiz, Pierre Del Moral, Joaquín Miguez

TL;DR

This work develops a self-contained, finite-dimensional framework for entropic optimal transport between Gaussian marginals by exploiting Gaussian conjugacy and Riccati recursions. It yields closed-form Schrödinger bridges and explicit Sinkhorn updates that mirror Kalman filtering in discrete time, with rigorous exponential convergence rates and quantitative entropy/Wasserstein bounds. The approach provides both theoretical insight and practical algorithms, including pseudocode and simulations, and elucidates how regularization drives independence or Monge-map regimes. The results bridge entropic transport, Bayesian filtering, and diffusion-model perspectives, offering a tractable, scalable Gaussian baseline and laying groundwork for extensions beyond Gaussianity.

Abstract

Entropic optimal transport problems are regularized versions of optimal transport problems. These models play an increasingly important role in machine learning and generative modelling. For finite spaces, these problems are commonly solved using the Sinkhorn algorithm (a.k.a. the iterative proportional fitting procedure). However, in more general settings the Sinkhorn iterations are based on nonlinear conditional/conjugate transformations and exact finite-dimensional solutions cannot be computed. This article presents a finite-dimensional recursive formulation of the iterative proportional fitting procedure for general Gaussian multivariate models. As expected, this recursive formulation is closely related to the celebrated Kalman filter and related Riccati matrix difference equations, and it yields algorithms that can be implemented in practical settings without further approximations. We extend this filtering methodology to develop a refined and self-contained convergence analysis of Gaussian Sinkhorn algorithms, including closed-form expressions of entropic transport maps and Schrödinger bridges.
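For orientation, the finite-space Sinkhorn (iterative proportional fitting) iteration mentioned in the abstract can be sketched as follows. This is a minimal illustration of the standard discrete algorithm, not the Gaussian recursion developed in the paper. The function name `sinkhorn`, the regularization parameter `eps`, the quadratic cost, and the toy grids are all assumptions made for this example.

```python
import numpy as np


def sinkhorn(mu, nu, cost, eps, n_iter=500):
    """Discrete entropic OT by alternating marginal rescaling (illustrative sketch).

    mu, nu : 1-D probability vectors (the two marginals)
    cost   : matrix of pairwise costs, shape (len(mu), len(nu))
    eps    : entropic regularization strength (hypothetical parameter name)
    """
    K = np.exp(-cost / eps)      # Gibbs kernel
    u = np.ones_like(mu)
    v = np.ones_like(nu)
    for _ in range(n_iter):
        u = mu / (K @ v)         # rescale rows to match the first marginal
        v = nu / (K.T @ u)       # rescale columns to match the second marginal
    return u[:, None] * K * v[None, :]


# Toy problem: uniform marginals on two small grids, squared-distance cost.
x = np.linspace(0.0, 1.0, 5)
y = np.linspace(0.0, 1.0, 4)
cost = (x[:, None] - y[None, :]) ** 2
mu = np.full(5, 1.0 / 5)
nu = np.full(4, 1.0 / 4)

plan = sinkhorn(mu, nu, cost, eps=0.1)
print(plan.sum(axis=1))  # row sums, close to mu
print(plan.sum(axis=0))  # column sums, equal to nu up to rounding
```

Each half-step enforces one marginal exactly, so the column sums match `nu` after the final update while the row sums match `mu` only in the limit. In the Gaussian setting of the paper the same alternation is carried out on means and covariances rather than on vectors of weights.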

Paper Structure

This paper contains 46 sections, 42 theorems, 546 equations, 6 figures, 1 algorithm.

Key Result

Lemma 2.2

The conjugate formula holds for any parameter set $\theta\in \Theta$.

Figures (6)

  • Figure 1: Evolution over time from $n=2$ to $n=24$ in steps of 2. The solid blue and red contours denote the distributions $\nu_{m, \sigma}$ (blue) and $\nu_{\overline{m},\overline{\sigma}}$ (red). The transparent contours show Gaussian distributions that iteratively approximate the end points of the bridge. From iteration 2 onward, Algorithm \ref{alg:gaussian-sinkhorn} converges quickly to the distributions $\nu_{m,\sigma}$ and $\nu_{\overline{m},\overline{\sigma}}$, completely overlapping with the targets in around 10 iterations.
  • Figure 2: A numerical demonstration of the Schrödinger bridge from $\nu_{m, \sigma}$ to $\nu_{\overline{m},\overline{\sigma}}$ using samples from $\nu_{m,\sigma}$.
  • Figure 3: A demonstration of the convergence rates derived in the paper for the 2D example introduced above. The left panel is a numerical demonstration of Theorem \ref{theo-qs}. The middle and right panels are a numerical demonstration of Corollary \ref{theo-cor-qs}, indicating that the derived rates are sharp and that the constants $c_\theta$ are small, since they are ignored in the plots. Dashed lines are included only to make the rates easier to compare visually with the blue curves.
  • Figure 4: On the left, we demonstrate the value of $\rho_\theta$ we obtain w.r.t. the regularization parameter $t$. It can be seen that $\rho_\theta$ decays to $0$ exponentially fast, compared to the rate $1/2$ found in chiarini. On the right, we demonstrate the convergence bound $\rho_\theta^{n/2}\|m_0 - \overline{m}\|$ with our $\rho_\theta$ estimates vs. $\rho = 0.5$.
  • Figure 5: Contraction coefficient $\rho_\theta$ and eigenvalue $\lambda_{\rm min}(r_\theta+\varpi_\theta)$ for $\sigma=\begin{pmatrix} t & 0 \\ 0 & 1 \end{pmatrix}$ (left) and $\sigma = tI$ (right). The reference parameter $\theta=(\alpha,\kappa,\tau)$ for this simulation is given by $\alpha=[0,0]'$, $\kappa = I_2$ and $\tau=I_2$.
  • ...and 1 more figure
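To make the comparison in the Figure 4 caption concrete, the bound $\rho_\theta^{n/2}\|m_0 - \overline{m}\|$ can be inverted to give the number of iterations needed to reach a given tolerance. The sketch below is only an arithmetic illustration: the helper `iterations_needed`, the tolerance, and the sample contraction values are assumptions, not quantities taken from the paper, and any multiplicative constant in the bound is ignored.

```python
import math


def iterations_needed(rho, dist0, tol):
    """Smallest n with rho**(n/2) * dist0 <= tol, for 0 < rho < 1 (hypothetical helper)."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie in (0, 1)")
    if dist0 <= tol:
        return 0
    # rho**(n/2) * dist0 <= tol  <=>  n >= 2 * log(tol / dist0) / log(rho)
    return math.ceil(2.0 * math.log(tol / dist0) / math.log(rho))


# Illustrative values only: an initial mean gap of 1 and a target accuracy of 1e-6.
n_half = iterations_needed(0.5, 1.0, 1e-6)    # fixed rate rho = 0.5
n_small = iterations_needed(0.05, 1.0, 1e-6)  # a smaller, made-up contraction value
print(n_half, n_small)
```

A smaller contraction coefficient shortens the required iteration count logarithmically, which is why a sharper estimate of $\rho_\theta$ matters in practice.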

Theorems & Definitions (62)

  • Definition 2.1
  • Lemma 2.2
  • Remark 2.3
  • Remark 2.4
  • Theorem 3.1
  • Remark 3.2
  • Theorem 3.3
  • Corollary 3.4
  • Remark 3.5
  • Corollary 3.6
  • ...and 52 more