Laplace Learning in Wasserstein Space

Mary Chriselda Antony Oliver, Michael Roberts, Carola-Bibiane Schönlieb, Matthew Thorpe

TL;DR

This work extends Laplace Learning to a Wasserstein submanifold of probability measures, grounding semi-supervised classification in an infinite-dimensional geometric setting. By proving Gamma-convergence of the discrete graph $p$-Dirichlet energy to a continuum energy and characterising the Laplace–Beltrami operator on the Wasserstein submanifold, the authors establish a rigorous link between graph-based learning and PDE-based diffusion in this context. The framework employs TL$^p$ topology and optimal transport tools to handle discretization and measure-valued data, with compactness results guaranteeing the existence of continuum minimisers. Numerical experiments on synthetic Gaussian data and ModelNet10 demonstrate robustness and consistency of classification as the data size grows and labels remain scarce. Overall, the paper provides a principled variational foundation for Laplace Learning in high-dimensional measure-embedded spaces, with potential extensions to unbalanced or alternative transport geometries.
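For orientation, one common form of a graph $p$-Dirichlet energy from the graph-based learning literature is sketched below, with the Euclidean distance replaced by a Wasserstein distance $W$ between data measures $\mu_1, \dots, \mu_n$. The kernel $\eta$, the prefactor, and the omission of normalising constants are assumptions made for illustration; the paper's energy $\mathcal{E}_{\varepsilon,m,n}$ also depends on a discretisation level $m$ and may be scaled differently.

```latex
% Illustrative only: a standard graph p-Dirichlet energy with Wasserstein edge weights.
% \eta is an assumed non-increasing kernel, e.g. \eta(t) = \mathbf{1}_{\{t \le 1\}}.
\[
  E_{\varepsilon,n}(f)
  = \frac{1}{\varepsilon^{p} n^{2}}
    \sum_{i,j=1}^{n}
    \eta\!\left(\frac{W(\mu_i,\mu_j)}{\varepsilon}\right)
    \bigl| f(\mu_i) - f(\mu_j) \bigr|^{p},
  \qquad
  f(\mu_i) = y_i \ \text{for every labelled } \mu_i .
\]
```

Laplace Learning corresponds to minimising such an energy with $p = 2$ subject to the label constraint, which amounts to solving a linear system in the graph Laplacian.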

Abstract

The manifold hypothesis posits that high-dimensional data typically resides on low-dimensional subspaces. In this paper, we assume the manifold hypothesis to investigate graph-based semi-supervised learning methods. In particular, we examine Laplace Learning in the Wasserstein space, extending the classical notion of graph-based semi-supervised learning algorithms from finite-dimensional Euclidean spaces to an infinite-dimensional setting. To achieve this, we prove variational convergence of a discrete graph $p$-Dirichlet energy to its continuum counterpart. In addition, we characterize the Laplace-Beltrami operator on a submanifold of the Wasserstein space. Finally, we validate the proposed theoretical framework through numerical experiments conducted on benchmark datasets, demonstrating the consistency of our classification performance in high-dimensional settings.
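To make the idea of Laplace Learning on measure-valued data concrete, here is a minimal sketch for the case $p = 2$. It is not the authors' implementation. It assumes each data point is a one-dimensional Gaussian $N(m, s^2)$, for which the 2-Wasserstein distance has the closed form $W_2^2 = (m_1 - m_2)^2 + (s_1 - s_2)^2$. The function names, the indicator-kernel $\varepsilon$-graph, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def w2_gaussian_1d(means, stds):
    """Pairwise 2-Wasserstein distances between 1D Gaussians N(m_i, s_i^2)."""
    dm = means[:, None] - means[None, :]
    ds = stds[:, None] - stds[None, :]
    return np.sqrt(dm**2 + ds**2)

def laplace_learning(dist, labeled_idx, y_labeled, eps):
    """Harmonic label extension on an eps-graph (Laplace Learning, p = 2).

    Assumes every connected component of the graph contains a labelled node,
    otherwise the linear system below is singular.
    """
    n = dist.shape[0]
    # Indicator kernel: connect two measures if their distance is at most eps.
    W = (dist <= eps).astype(float)
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W  # unnormalised graph Laplacian

    classes = np.unique(y_labeled)
    Y = (y_labeled[:, None] == classes[None, :]).astype(float)  # one-hot labels

    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    L_ul = L[np.ix_(unlabeled_idx, labeled_idx)]

    # Minimiser of the graph Dirichlet energy with fixed labels: L_uu U = -L_ul Y.
    U = np.linalg.solve(L_uu, -L_ul @ Y)

    pred = np.empty(n, dtype=y_labeled.dtype)
    pred[labeled_idx] = y_labeled
    pred[unlabeled_idx] = classes[np.argmax(U, axis=1)]
    return pred

# Toy data (hypothetical): two well-separated families of Gaussian measures.
means = np.concatenate([np.linspace(-0.5, 0.5, 20), np.linspace(4.5, 5.5, 20)])
stds = np.concatenate([np.full(20, 1.0), np.full(20, 2.0)])
true_labels = np.concatenate([np.zeros(20, dtype=int), np.ones(20, dtype=int)])

labeled_idx = np.array([0, 20])  # one labelled measure per class
dist = w2_gaussian_1d(means, stds)
pred = laplace_learning(dist, labeled_idx, true_labels[labeled_idx], eps=0.5)
print("accuracy:", np.mean(pred == true_labels))
```

The dense solve is adequate only for small graphs; for larger datasets a sparse Laplacian with an iterative solver would be the more natural choice.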

Paper Structure

This paper contains 34 sections, 20 theorems, 139 equations, 5 figures.

Key Result

Theorem 3.1

Let $\Omega \subset \mathbb{R}^k$ be an open, connected and bounded domain with Lipschitz boundary with $k \geq 1$. Let $\mu$ be a probability measure on $\Omega$ with density $\rho_\mu : \Omega \to (0, \infty)$ such that there exists $C > c \geq 1$ for which, Let $\alpha>2$, $p\in \{2,\infty\}$, and $S^{*}_m$ be an optimal transport map between $\mu$ and $\mu^{(m)}$ with respect to the $p$-Wasse

Figures (5)

  • Figure 1: A schematic diagram outlining the analytical framework of the study. The primary objective is to demonstrate the $\Gamma$-convergence of $\mathcal{E}_{\varepsilon,m,n}(\cdot)$ to $\mathcal{E}_\infty(\cdot)$ as $n \to \infty$, $m \to \infty$, and $\varepsilon \to 0$, with the latter two limits approaching at a suitable rate. Whilst our main result is to establish variational convergence from (b.) to (c.), our proof follows the sequence (b. $\rightarrow$ a. $\rightarrow$ d. $\rightarrow$ e. $\rightarrow$ c.). For clarity, the feature vectors depicted in the top row of the figure represent probability measures.
  • Figure 2: The synthetic dataset comprising 800 Gaussian samples (left, each depicted by different colours), along with the corresponding density plots (right).
  • Figure 3: Accuracy (%) plotted against the number of samples (log scale) in Experiment 1 for varying training label rates (20%, 40%, 60%), represented by coloured lines with markers. The shaded regions around each line indicate the 95% confidence intervals computed from the standard deviations across 100 runs. The secondary y-axis (purple) shows the corresponding values of the connectivity threshold ($\varepsilon_n$) across sample sizes. Accuracy increases with both the number of samples and training label rates, while $\varepsilon_n$ decreases as sample size grows.
  • Figure 4: ModelNet10 dataset.
  • Figure 5: a. Mean classification accuracy on the ModelNet10 dataset using the linear Wasserstein distance with $k$-nearest-neighbour graphs for $k=15,20,25$. Results are averaged over 100 iterations for each training label rate (20–80%), with 95% confidence intervals shown as shaded regions around the lines. For comparison, the supervised PointNet model (Qi et al.) was trained for 100 epochs using the Adam optimizer (learning rate 0.001), batch size 32, and cross-entropy loss. b. Confusion matrix showing classification accuracy (in %) for different class labels at 80% training label rate.

Theorems & Definitions (44)

  • Remark 2.1
  • Theorem 3.1
  • proof
  • Remark 3.2
  • Definition 3.3
  • Proposition 3.4
  • proof
  • Remark 3.5
  • Theorem 3.6
  • Definition 3.7
  • ...and 34 more