Synthesis and Analysis of Data as Probability Measures with Entropy-Regularized Optimal Transport

Brendan Mallery; James M. Murphy; Shuchin Aeron

TL;DR

This work develops entropy-regularized Wasserstein-2 barycenters for both synthesis and analysis of probability measures. It proves that the entropic cost is differentiable with respect to the source measure under mild subgaussian assumptions, and uses this derivative to characterize barycenters as fixed points of an average of entropic maps. When the analyzed measure is itself a barycenter, this characterization yields a finite-dimensional convex quadratic program for the analysis problem. The paper establishes dimension-free sample complexity for estimating the barycenter functional and the barycentric coefficients, and shows that barycentric representations are stable under perturbations, which makes the coefficients usable as sample-efficient features. On corrupted point-cloud data, barycentric coefficients give competitive classification performance with limited training data.
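The fixed-point characterization suggests a simple particle scheme for the synthesis problem: repeatedly move a cloud of particles through the weighted average of the entropic maps to each reference measure. The sketch below is only an illustration under stated assumptions and is not the paper's algorithm. The function names (`entropic_map`, `synthesize_barycenter`), the cost convention $\tfrac12\|x-y\|^2$, the uniform empirical measures, the random initialization, and the iteration counts are all hypothetical choices made for this example.

```python
import numpy as np


def _logsumexp(z, axis):
    """Numerically stable log-sum-exp along one axis."""
    zmax = z.max(axis=axis, keepdims=True)
    out = zmax + np.log(np.exp(z - zmax).sum(axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis)


def entropic_map(x, y, eps, n_iter=300):
    """Barycentric projection of the Sinkhorn plan between uniform
    empirical measures on x (n, d) and y (m, d).

    Assumption: cost c(x, y) = 0.5 * ||x - y||^2.
    """
    n, m = len(x), len(y)
    cost = 0.5 * ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    log_a, log_b = -np.log(n), -np.log(m)
    f, g = np.zeros(n), np.zeros(m)
    for _ in range(n_iter):
        # Log-domain Sinkhorn updates of the two dual potentials.
        g = -eps * _logsumexp((f[:, None] - cost) / eps + log_a, axis=0)
        f = -eps * _logsumexp((g[None, :] - cost) / eps + log_b, axis=1)
    plan = np.exp((f[:, None] + g[None, :] - cost) / eps + log_a + log_b)
    # Each source point is sent to the plan-weighted mean of the targets.
    return (plan @ y) / plan.sum(axis=1, keepdims=True)


def synthesize_barycenter(refs, weights, eps, n_particles=100, n_outer=30, seed=0):
    """Heuristic fixed-point iteration x <- sum_j weights[j] * T_j(x),
    where T_j is the entropic map from the current particles to refs[j]."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate(refs)
    dim = pooled.shape[1]
    # Hypothetical initialization: Gaussian cloud around the pooled mean.
    x = pooled.mean(axis=0) + rng.standard_normal((n_particles, dim))
    for _ in range(n_outer):
        x = sum(w * entropic_map(x, y, eps) for w, y in zip(weights, refs))
    return x
```

A fixed point of this iteration is a particle configuration on which the weighted average of the entropic maps acts as the identity, which is the condition the characterization describes. Convergence of the iteration is not guaranteed by anything in this summary.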

Abstract

We consider synthesis and analysis of probability measures using the entropy-regularized Wasserstein-2 cost and its unbiased version, the Sinkhorn divergence. The synthesis problem consists of computing the barycenter, with respect to these costs, of reference measures given a set of coefficients belonging to the simplex. The analysis problem consists of finding the coefficients for the closest barycenter in the Wasserstein-2 distance to a given measure. Under the weakest assumptions on the measures thus far in the literature, we compute the derivative of the entropy-regularized Wasserstein-2 cost. We leverage this to establish a characterization of barycenters with respect to the entropy-regularized Wasserstein-2 cost as solutions that correspond to a fixed point of an average of the entropy-regularized displacement maps. This characterization yields a finite-dimensional, convex, quadratic program for solving the analysis problem when the measure being analyzed is a barycenter with respect to the entropy-regularized Wasserstein-2 cost. We show that these coefficients, as well as the value of the barycenter functional, can be estimated from samples with dimension-independent rates of convergence, and that barycentric coefficients are stable with respect to perturbations in the Wasserstein-2 metric. We employ the barycentric coefficients as features for classification of corrupted point cloud data, and show that compared to neural network baselines, our approach is more efficient in small training data regimes.
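To make the analysis problem concrete, the following sketch estimates barycentric coefficients from samples by minimizing a quadratic form over the simplex. It assumes that the quadratic program uses the Gram matrix of the entropic displacements, $A_{ij} = \int \langle T_i(x) - x,\, T_j(x) - x\rangle \, d\mu(x)$, where $T_j$ is the entropic map from $\mu$ to $\nu_j$. That form is inferred from the fixed-point description above and is not quoted from the paper. The function names, the cost convention $\tfrac12\|x-y\|^2$, and the projected-gradient solver are likewise hypothetical choices for illustration.

```python
import numpy as np


def _logsumexp(z, axis):
    """Numerically stable log-sum-exp along one axis."""
    zmax = z.max(axis=axis, keepdims=True)
    out = zmax + np.log(np.exp(z - zmax).sum(axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis)


def entropic_map(x, y, eps, n_iter=300):
    """Barycentric projection of the Sinkhorn plan from x (n, d) to y (m, d).
    Assumption: uniform weights and cost 0.5 * ||x - y||^2."""
    n, m = len(x), len(y)
    cost = 0.5 * ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    log_a, log_b = -np.log(n), -np.log(m)
    f, g = np.zeros(n), np.zeros(m)
    for _ in range(n_iter):
        g = -eps * _logsumexp((f[:, None] - cost) / eps + log_a, axis=0)
        f = -eps * _logsumexp((g[None, :] - cost) / eps + log_b, axis=1)
    plan = np.exp((f[:, None] + g[None, :] - cost) / eps + log_a + log_b)
    return (plan @ y) / plan.sum(axis=1, keepdims=True)


def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)


def analysis_coefficients(mu, refs, eps, n_steps=2000):
    """Minimize lam^T A lam over the simplex, with A the (assumed) Gram
    matrix of entropic displacements T_j(x) - x averaged over samples of mu."""
    disp = np.stack([entropic_map(mu, y, eps) - mu for y in refs])  # (m, n, d)
    gram = np.einsum("ind,jnd->ij", disp, disp) / len(mu)
    lam = np.full(len(refs), 1.0 / len(refs))
    # Step size 1/L, where L = 2 * lambda_max(A) bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * max(np.linalg.eigvalsh(gram)[-1], 1e-12))
    for _ in range(n_steps):
        lam = project_simplex(lam - step * 2.0 * (gram @ lam))
    return lam, gram
```

Because `gram` is a Gram matrix it is positive semidefinite, so the program is convex and projected gradient descent is a reasonable, if basic, solver. As the abstract notes, the quadratic program is justified when the analyzed measure is itself an entropic barycenter of the references; for other inputs the output of this sketch is only a heuristic.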

Paper Structure (39 sections, 55 theorems, 205 equations, 5 figures, 1 table, 6 algorithms)

INTRODUCTION
Notation and Background
Main Contributions
Critical Points and Optimality Criteria for $\mathcal{F}^\epsilon_{\lambda,\mathcal{V}}$
Sample Complexity for Synthesis and Analysis
Application to Point Cloud Classification with Barycentric coefficients
Related Work
ANALYTIC PROPERTIES OF $\mathcal{F}^\epsilon_{\lambda,\mathcal{V}}$
SAMPLE COMPLEXITY OF ESTIMATING $\min_\mu\mathcal{F}^\epsilon_{\lambda,\mathcal{V}}(\mu)$
SAMPLE COMPLEXITY AND STABILITY FOR THE ANALYSIS PROBLEM
Numerical Verification of Theorem \ref{thm:coeff_theorem}
POINT CLOUD CLASSIFICATION
CONCLUSIONS AND OPEN PROBLEMS
BACKGROUND
Background on Subgaussian Measures
...and 24 more sections

Key Result

Proposition 2.1

Let $\mathcal{V}=\{\nu_{j}\}_{j=1}^{m}\subset\mathcal{P}_{2}(\Omega)$. Then for any $\epsilon>0$, $F^\epsilon_{\lambda,\mathcal{V}}$ admits a minimizer over $\mathcal{P}_2(\Omega)$, and for any $\sigma>0$, $S^\epsilon_{\lambda,\mathcal{V}}$ admits a minimizer over $\mathcal{G}_\sigma(\Omega)$. If …

Figures (5)

Figure 1: (Top left) Average $\log$ $\ell^2$-loss for two random 1D Gaussian measures with random weights, $\epsilon=2$. (Top right) Average $\log$ $\ell^2$-loss for three random 1D Gaussian measures with random weights, $\epsilon=2$. (Bottom left) Average $\log$ $\ell^2$-loss for three random 5D Gaussian measures, $\lambda=(0.2201,0.0269,0.7530)$, $\epsilon=1$. (Bottom right) Average $\log$ $\ell^2$-loss for three uniform measures on random 5D cubes, $\lambda=(0.5112,0.4477,0.0411)$, $\epsilon=0.1$.
Figure 2: Average $\log$ squared-$OT_2$-loss for three random 1D Gaussian measures with random weights, $\epsilon=2$.
Figure 3: (Left) Synthesized entropy-regularized barycenters, with the reference measures at the corners. (Right) The original coefficients alongside the coefficients recovered by our analysis algorithm, showing good recovery.
Figure 4: (Left to right) Point clouds from clean, dropout_global_4, and local_dropout_4.
Figure 5: (Top left) Global dropout, $F^\epsilon_{\lambda,\mathcal{V}}$ reconstruction. (Top right) Global dropout, $S^\epsilon_{\lambda,\mathcal{V}}$ reconstruction. (Bottom left) Local dropout, $F^\epsilon_{\lambda,\mathcal{V}}$ reconstruction. (Bottom right) Local dropout, $S^\epsilon_{\lambda,\mathcal{V}}$ reconstruction. $OT_2$-cost denotes the $OT_2$ distance between the (uniform measures on the) clean point cloud and the reconstruction.

Theorems & Definitions (86)

Definition 1.1
Definition 1.2
Definition 1.3
Proposition 2.1
Theorem 2.2
Proposition 2.3
Corollary 2.4
Corollary 2.5
Lemma 2.6
Proposition 2.7
...and 76 more
