Convergence rates for regularized unbalanced optimal transport: the discrete case

Luca Nenna, Paul Pegon, Louis Tocquec

TL;DR

This work analyzes regularized unbalanced optimal transport in the discrete setting with finite supports, formulating a general framework that couples a transport cost with a convex marginal penalty and an entropy regularization. It derives existence, uniqueness, and duality results for both the regularized and unregularized problems, and studies the trajectory of the primal and dual optimizers as a function of $\varepsilon$. The authors prove that as $\varepsilon \to 0$, the dual optimizer converges to the unregularized dual optimizer at rate $O(\varepsilon)$, while the primal transport plan converges to the entropy-minimizing unregularized optimizer at rate at least $O(\sqrt{\varepsilon})$, with a refined asymptotic description for the dual via a rescaled variable $d(\varepsilon)$. Numerical experiments across KL and quadratic-type divergences corroborate the theoretical rates and demonstrate the practical relevance of the asymptotic results for finite-sample discrete UOT problems.
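To make the three-part objective above concrete (transport cost, marginal penalty, entropy term), here is a minimal sketch for one special case only. It assumes KL marginal penalties with weight `tau` and the entropy $\sum_{ij} \gamma_{ij}(\log \gamma_{ij} - 1)$, and uses the standard Sinkhorn-type scaling iterations for that case. It is not the authors' code; the function name `uot_kl_sinkhorn`, the parameters, and the toy data are all illustrative assumptions.

```python
import numpy as np

def uot_kl_sinkhorn(C, a, b, eps, tau, n_iter=500):
    """Hypothetical helper: scaling iterations for
        min_g <C, g> + eps * sum(g * (log g - 1))
                     + tau * KL(g 1 | a) + tau * KL(g^T 1 | b),
    with KL(p | q) = sum(p * log(p / q) - p + q).
    Assumes a, b > 0; not numerically stabilized for very small eps.
    """
    K = np.exp(-C / eps)          # Gibbs kernel
    fi = tau / (tau + eps)        # exponent from the KL marginal penalty
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = (a / (K @ v)) ** fi   # update row scaling
        v = (b / (K.T @ u)) ** fi # update column scaling
    return u[:, None] * K * v[None, :]

# Toy data (assumed): two small 2D point clouds with different total masses.
rng = np.random.default_rng(0)
x = rng.random((5, 2))
y = rng.random((6, 2))
C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # squared Euclidean cost
a = np.full(5, 0.2)    # total mass 1.0
b = np.full(6, 0.25)   # total mass 1.5

eps, tau = 0.5, 1.0
plan = uot_kl_sinkhorn(C, a, b, eps, tau)

# Residual of the first-order optimality condition; close to 0 at convergence.
residual = (C + eps * np.log(plan)
            + tau * np.log(plan.sum(axis=1) / a)[:, None]
            + tau * np.log(plan.sum(axis=0) / b)[None, :])
print(np.abs(residual).max())
```

Rerunning the solver over a decreasing sequence of `eps` values and recording the distance to an unregularized optimizer is one way to observe rates of the kind discussed here. For very small `eps` a log-domain implementation would be needed to avoid underflow in `K`.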

Abstract

Unbalanced optimal transport (UOT) is a natural extension of optimal transport (OT) that allows comparison between measures of different masses. It arises naturally in machine learning, where it offers robustness against outliers. The aim of this work is to provide convergence rates of the regularized transport cost and plans toward the solutions of the original (unregularized) problem when both measures are weighted sums of Dirac masses.

Paper Structure

This paper contains 14 sections, 16 theorems, 118 equations, 7 figures.

Key Result

proposition 1

For any $\varepsilon>0$, the regularized problem $(\mathrm{OT}_\varepsilon)$ and its dual admit unique solutions $\gamma_\varepsilon$ and $\xi_\varepsilon$, respectively (for $\xi_\varepsilon$, uniqueness holds up to the kernel of $A^*$, which has dimension $1$). Moreover, the following relation holds

Figures (7)

  • Figure 1: Three Datasets. The left panel shows two discretized Gaussian measures. The center panel shows two weighted point clouds in 2D. The right panel shows two weighted point clouds in 3D.
  • Figure 2: $\mathrm{F}(\cdot) = \mathrm{D}_\phi(\cdot\vert q)$ with $\phi(t)=t(\log t - 1)$; $\mu$ and $\nu$ are discretized Gaussian measures. The left plot (log scale) represents the behavior of $\varepsilon \mapsto \lVert \gamma(\varepsilon) - \bar{\gamma} \rVert$ (blue) compared to the expected rate (orange). The right plot (log scale) represents $\varepsilon \mapsto \lVert \xi(\varepsilon) - \bar{\xi} \rVert$ (blue) compared to the expected rate (orange).
  • Figure 3: $\mathrm{F}(\cdot) = \tau \mathrm{D}_\phi(\cdot\vert q)$ with $\phi(t)=t(\log t - 1)$ and $\tau=10$; $\mu$ and $\nu$ are measures supported on 2D point clouds. The left plot (log scale) represents the behavior of $\varepsilon \mapsto \lVert \gamma(\varepsilon) - \bar{\gamma} \rVert$ (blue) compared to the expected rate (orange). The right plot (log scale) represents $\varepsilon \mapsto \lVert \xi(\varepsilon) - \bar{\xi} \rVert$ (blue) compared to the expected rate (orange).
  • Figure 4: $\mathrm{F}(\cdot) = \tau\mathrm{D}_\phi(\cdot\vert q)$ with $\phi(t)=t(\log t - 1)$ and $\tau=10$; $\mu$ and $\nu$ are measures supported on 3D point clouds. The left plot (log scale) represents the behavior of $\varepsilon \mapsto \lVert \gamma(\varepsilon) - \bar{\gamma} \rVert$ (blue) compared to the expected rate (orange). The right plot (log scale) represents $\varepsilon \mapsto \lVert \xi(\varepsilon) - \bar{\xi} \rVert$ (blue) compared to the expected rate (orange).
  • Figure 5: $\mathrm{F}(\cdot) = \mathrm{D}_\phi(\cdot\vert q)$ with $\phi(t)=\frac{1}{2}\lvert t-1\rvert^2$; $\mu$ and $\nu$ are discretized Gaussian measures. The left plot (log scale) represents the behavior of $\varepsilon \mapsto \lVert \gamma(\varepsilon) - \bar{\gamma} \rVert$ (blue) compared to the expected rate (orange). The right plot (log scale) represents $\varepsilon \mapsto \lVert \xi(\varepsilon) - \bar{\xi} \rVert$ (blue) compared to the expected rate (orange).
  • ...and 2 more figures

Theorems & Definitions (42)

  • remark 1
  • remark 2
  • proposition 1
  • definition 2.1
  • remark 3
  • remark 4
  • proposition 2
  • proof
  • definition 2.2: Slack variable
  • proposition 3
  • ...and 32 more