Linear Optimal Partial Transport Embedding

Yikun Bai, Ivan Medri, Rocio Diaz Martin, Rana Muhammad Shahroz Khan, Soheil Kolouri

TL;DR

This work addresses the computational burden of comparing nonnegative measures with unequal mass by extending optimal transport theory to optimal partial transport (OPT). It introduces Linear Optimal Partial Transport ($LOPT$), a linearized embedding of measures into a Euclidean tangent space built from OPT's dynamic formulation and barycentric projections, enabling efficient computation of OPT-based similarities. The paper develops both continuous and discrete formulations, defines the $LOPT$ embedding and a corresponding discrepancy, and constructs an OPT interpolation analogous to LOT geodesics. It demonstrates the approach on tasks such as fast OPT distance approximation, point-cloud interpolation, and PCA analysis, reporting improved robustness to noise and substantial computational savings over exact OPT while preserving transport structure. These results suggest practical impact for large-scale, mass-variable measure comparisons and related data-analysis tasks.

Abstract

Optimal transport (OT) has gained popularity due to its various applications in fields such as machine learning, statistics, and signal processing. However, the balanced mass requirement limits its performance in practical problems. To address these limitations, variants of the OT problem, including unbalanced OT, optimal partial transport (OPT), and Hellinger-Kantorovich (HK), have been proposed. In this paper, we propose the linear optimal partial transport (LOPT) embedding, which extends the (local) linearization technique on OT and HK to the OPT problem. The proposed embedding allows for faster computation of the OPT distance between pairs of positive measures. Besides our theoretical contributions, we demonstrate the LOPT embedding technique in point-cloud interpolation and PCA.

Paper Structure (31 sections, 12 theorems, 96 equations, 12 figures)

This paper contains 31 sections, 12 theorems, 96 equations, 12 figures.

Key Result

Lemma 2.1

Let $\mu^0$ and $\mu^j$ be two discrete probability measures as in eq: discrete measures, and consider an OT barycentric projection $\hat{\mu}^j$ of $\mu^j$ with respect to $\mu^0$ as in eq: discrete barycenter. Then, the map $x_n^0\mapsto \hat{x}_n^j$ given by eq: discrete barycenter solves the OT
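
The barycentric projection mentioned in Lemma 2.1 can be illustrated with a minimal sketch. The equations the lemma refers to are not reproduced here, so the code assumes the standard LOT form $\hat{x}_n^j = \frac{1}{p_n^0}\sum_m \gamma_{n,m}\, x_m^j$, where $\gamma$ is a transport plan between $\mu^0$ and $\mu^j$ and $p_n^0$ is the mass at $x_n^0$. The function name, the toy plan, and the point locations are hypothetical; the plan is written by hand, not computed by an OT solver.

```python
def barycentric_projection(gamma, p0, xj):
    """Map each reference point to the plan-weighted mean of its targets.

    gamma: N x M transport plan as a list of rows; row n sums to p0[n].
    p0:    N reference masses, all assumed positive.
    xj:    M target locations, each a tuple of coordinates.
    """
    dim = len(xj[0])
    projected = []
    for row, mass in zip(gamma, p0):
        # Weighted average of the target points that receive mass from this row.
        point = tuple(
            sum(g * x[d] for g, x in zip(row, xj)) / mass
            for d in range(dim)
        )
        projected.append(point)
    return projected


# Toy example (assumed data): two reference points, three target points,
# both measures have total mass 1.
p0 = [0.5, 0.5]
xj = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)]
gamma = [
    [0.25, 0.25, 0.0],  # first reference point splits its mass over two targets
    [0.0, 0.0, 0.5],    # second reference point sends all its mass to one target
]
x_hat = barycentric_projection(gamma, p0, xj)
```

In this toy case the first reference point is mapped to the midpoint of the two targets it feeds, and the second is mapped directly onto its single target, so the projected measure keeps the reference masses while moving the support.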

Figures (12)

  • Figure 1: The depiction of the HK and OPT geodesics between two measures, at times $t\in\{0,0.25,0.5,0.75,1\}$. The top row (blue) represents two initial deltas of mass one located at positions -1.2 and -1. The bottom row (purple) shows two final deltas of mass one located at 1 and 1.2. At intermediate time steps $t=0.25,0.5,0.75$, the transported part (middle delta moving from -1 to 1) changes mass for HK while its mass remains constant for OPT. Outer masses (located at -1.2 for initial time $t=0$, and at 1.2 for final time $t=1$) are being destroyed and created, so mass changes are expected. Notably, mass is created/destroyed with a linear rate for OPT and a nonlinear rate for HK. See the appendix comparing HK and OPT for further analysis.
  • Figure 2: Graphs of the mean and median relative errors between $OPT_\lambda$ and $LOPT_{\lambda,\mu_0}$ as a function of the parameter $\lambda$.
  • Figure 3: Wall-clock time comparison between OPT and LOPT. The LP solver in PythonOT (Flamary et al., 2021) is applied to each individual OPT problem, with a maximum of $100N$ iterations.
  • Figure 4: OT geodesic, OPT interpolation, LOT geodesic, and LOPT interpolation on the point-cloud MNIST 2D dataset (https://www.kaggle.com/datasets/cristiangarcia/pointcloudmnist2d). The LOT geodesic and LOPT interpolation use the same reference measure. The percentage of noise $\eta$ is set to $0.5$. For OPT and LOPT interpolation, we set $\lambda=20$; for HK and LHK, we set the scaling to be $2.5$.
  • Figure 5: We plot the first two principal components of each $u^j$ based on LOT and LOPT. For LOPT, we set $\lambda=20.0$, and for LHK, we set the scaling to be $2.5$.
  • ...and 7 more figures
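
The OPT behaviour described in the Figure 1 caption can be sketched in a few lines. This is only a reading of the caption, not the paper's formal definition of OPT interpolation: the transported delta moves from -1 to 1 at constant mass, the delta at -1.2 loses mass at a linear rate, and the delta at 1.2 gains mass at a linear rate. The function name is hypothetical.

```python
def opt_interpolation(t):
    """Return (position, mass) pairs for the Figure 1 setup at time t in [0, 1]."""
    # Source mass at -1.2 is not transported; it is destroyed linearly in t.
    destroyed = (-1.2, 1.0 - t)
    # The transported delta moves on a straight line from -1 to 1, mass fixed at 1.
    transported = ((1.0 - t) * -1.0 + t * 1.0, 1.0)
    # Target mass at 1.2 is created linearly in t.
    created = (1.2, t)
    return [destroyed, transported, created]


# Snapshots at the time steps used in the figure.
snapshots = {t: opt_interpolation(t) for t in (0.0, 0.25, 0.5, 0.75, 1.0)}
```

Under these assumptions the total mass stays at 2 for every $t$, because the linear destruction and creation rates cancel; the caption indicates that the HK geodesic differs by also changing the mass of the transported delta.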

Theorems & Definitions (27)

  • Lemma 2.1
  • Proposition 2.2
  • Proposition 3.1
  • Definition 3.2
  • Definition 3.3
  • Definition 3.4
  • Theorem 3.5
  • Theorem 3.6
  • Proposition 3.7
  • Definition 3.8
  • ...and 17 more