Linearized Optimal Transport pyLOT Library: A Toolkit for Machine Learning on Point Clouds
Jun Linwu, Varun Khurana, Nicholas Karris, Alexander Cloninger
TL;DR
This work introduces pyLOT, a Python toolkit for Linearized Optimal Transport (LOT) that embeds measure-valued data into the Hilbert space $L^2(\sigma)$ by representing each measure $\mu$ by the optimal transport map $T_\sigma^{\mu}$ from a fixed reference measure $\sigma$, so that standard linear ML methods can be applied to point clouds. It details the LOT framework, discusses its theoretical underpinnings and convergence properties, and demonstrates its practical use on high-resolution 3D tooth scans, covering dimensionality reduction, classification, and fast LOT barycenter generation. The results show that LOT embeddings give competitive classification performance and that LOT barycenters offer substantial computational savings over traditional Wasserstein barycenters, with iterative reference updates further improving barycenter quality. Overall, pyLOT provides a scalable, versatile toolkit for processing and analyzing measure-valued data in ML pipelines, with potential extensions to time-series data and broader pre-processing stages.
Abstract
The pyLOT library offers a Python implementation of linearized optimal transport (LOT) techniques, together with methods for using them in downstream tasks. The pipeline embeds probability distributions into a Hilbert space via the optimal transport maps from a fixed reference distribution, and this linearization allows downstream tasks to be completed using off-the-shelf (linear) machine learning algorithms. We provide a case study of performing ML on 3D scans of lemur teeth, where the original questions of classification, clustering, dimension reduction, and data generation reduce to simple linear operations performed on the LOT-embedded representations.
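The embedding step described above can be illustrated with a minimal sketch. It does not use the pyLOT API; the helper names `lot_embed` and `lot_distance` are hypothetical and introduced only for illustration. The sketch assumes the simplest discrete setting, in which the reference and every point cloud have the same number of points with uniform weights, so that an optimal transport map is a permutation and can be found with SciPy's assignment solver. The data here are synthetic Gaussian clouds, not the tooth scans from the case study.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def lot_embed(reference, target):
    """Return the image of each reference point under a discrete OT map.

    Assumption: both clouds have the same number of points and uniform
    weights, so an optimal plan for the squared Euclidean cost is a
    permutation.
    """
    cost = cdist(reference, target, metric="sqeuclidean")
    _, col = linear_sum_assignment(cost)  # rows come back in order 0..n-1
    return target[col]  # row i is the image of reference[i]


def lot_distance(emb_a, emb_b):
    """L^2(sigma) distance between two embeddings (uniform reference weights)."""
    return np.sqrt(np.mean(np.sum((emb_a - emb_b) ** 2, axis=1)))


rng = np.random.default_rng(0)
n, d = 200, 3
reference = rng.normal(size=(n, d))  # fixed reference cloud sigma
clouds = [rng.normal(size=(n, d)) + shift for shift in (0.0, 2.0, 4.0)]

# Each cloud becomes an (n, d) array indexed by the reference points.
embeddings = np.stack([lot_embed(reference, c) for c in clouds])

# Flattened embeddings are ordinary vectors for linear ML methods.
features = embeddings.reshape(len(clouds), -1)

# A LOT barycenter sketch: the pointwise average of the embedded maps.
barycenter = embeddings.mean(axis=0)
```

Because every cloud is re-indexed by the same reference points, the rows of `features` can be passed directly to linear tools such as PCA or a linear classifier, and averaging in the embedding space replaces an iterative Wasserstein barycenter computation. Point clouds with unequal sizes or non-uniform weights would instead need a general OT solver and a barycentric projection of the resulting plan.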
