Manifold Learning with Sparse Regularised Optimal Transport
Stephen Zhang, Gilles Mordant, Tetsuya Matsumoto, Geoffrey Schiebinger
TL;DR
The paper develops a manifold learning framework based on a symmetric, quadratically regularised optimal transport (QOT) projection that forms a sparse, adaptive affinity matrix respecting the latent geometry. It proves that the induced discrete operator converges to a Laplace-type operator and is robust to heteroskedastic ambient noise, and it motivates the construction through a link to nonlinear diffusion via the porous medium equation. Theoretical contributions include finite-sample dual-potential rates, robustness bounds, and convergence results, supplemented by an efficient symmetric semi-smooth Newton solver and an active-set variant for large datasets. Empirically, the method is more resilient to noise than traditional kNN- or entropic-based approaches and performs competitively on synthetic manifolds, spectral clustering, MNIST, and single-cell RNA-seq data. The work offers a scalable, geometry-preserving approach to diffusion-based manifold learning for high-dimensional data in which noise and sampling heterogeneity are prevalent.
Abstract
Manifold learning is a central task in modern statistics and data science. Many datasets (cells, documents, images, molecules) can be represented as point clouds embedded in a high-dimensional ambient space; however, the degrees of freedom intrinsic to the data are usually far fewer than the number of ambient dimensions. Detecting a latent manifold along which the data are embedded is a prerequisite for a wide family of downstream analyses. Real-world datasets are subject to noisy observations and sampling, so distilling information about the underlying manifold is a major challenge. We propose a method for manifold learning that uses a symmetric version of optimal transport with quadratic regularisation to construct a sparse and adaptive affinity matrix, which can be interpreted as a generalisation of the bistochastic kernel normalisation. We prove that the resulting kernel is consistent with a Laplace-type operator in the continuous limit, establish robustness to heteroskedastic noise, and exhibit these results in numerical experiments. We identify a highly efficient computational scheme for computing this optimal transport for discrete data and demonstrate that it outperforms competing methods in a set of examples.
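
To make the construction concrete, the following is a minimal sketch of how a sparse, symmetric affinity matrix with unit row sums might be computed. The abstract does not state the optimisation problem, so the sketch assumes a common form of symmetric quadratically regularised optimal transport: minimise <C, W> + (eps / 2) ||W||_F^2 over symmetric, nonnegative W with W1 = 1, where C holds squared Euclidean distances. Under that assumption the optimum is W_ij = max(u_i + u_j - C_ij, 0) / eps for a dual vector u. The function name `symmetric_qot`, the damped Newton iteration, the ridge term, and the toy data are illustrative choices, not the authors' solver.

```python
import numpy as np

def symmetric_qot(C, eps, tol=1e-10, max_iter=200):
    """Sketch of symmetric quadratically regularised OT (assumed formulation).

    Assumed problem: minimise <C, W> + (eps / 2) * ||W||_F^2 over symmetric
    W >= 0 with unit row sums; then W_ij = max(u_i + u_j - C_ij, 0) / eps.
    """
    n = C.shape[0]

    def objective(u):
        # Convex dual objective; its gradient is (row sums of W) - 1.
        P = np.maximum(u[:, None] + u[None, :] - C, 0.0)
        return (P ** 2).sum() / (4.0 * eps) - u.sum()

    # Start with every entry active so the first Hessian is well conditioned.
    u = np.full(n, 0.5 * C.max() + eps)
    for _ in range(max_iter):
        P = np.maximum(u[:, None] + u[None, :] - C, 0.0)
        g = P.sum(axis=1) / eps - 1.0          # row-sum residual
        if np.abs(g).max() < tol:
            break
        A = (P > 0).astype(float)              # current active set
        H = (np.diag(A.sum(axis=1)) + A) / eps # generalised Hessian
        H[np.diag_indices(n)] += 1e-10 / eps   # small ridge (assumed safeguard)
        d = np.linalg.solve(H, -g)
        # Backtracking (Armijo) line search on the convex objective.
        f0, slope, t = objective(u), g @ d, 1.0
        slack = 1e-14 * max(1.0, abs(f0))
        while objective(u + t * d) > f0 + 1e-4 * t * slope + slack and t > 1e-12:
            t *= 0.5
        u = u + t * d
    return np.maximum(u[:, None] + u[None, :] - C, 0.0) / eps

# Toy data (hypothetical): 60 noisy points on a circle.
rng = np.random.default_rng(0)
theta = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
X = np.c_[np.cos(theta), np.sin(theta)] + 0.02 * rng.normal(size=(60, 2))
C = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)  # squared distances
W = symmetric_qot(C, eps=0.5)
```

In this sketch the thresholding at zero is what produces sparsity: entries with u_i + u_j below C_ij are exactly zero, so each point is connected only to nearby points, and the neighbourhood size adapts through u rather than being fixed in advance as in a kNN graph. Smaller values of `eps` give sparser matrices.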
