Scalable Optimal Transport Methods in Machine Learning: A Contemporary Survey

Abdelwahed Khamis; Russell Tsuchida; Mohamed Tarek; Vivien Rolland; Lars Petersson

TL;DR

This survey systematically analyzes the methods used in the literature for scaling OT and presents the findings in a unified taxonomy, addressing the fundamental question of how to scale optimal transport to cope with the current demands of big and high-dimensional data.

Abstract

Optimal Transport (OT) is a mathematical framework that first emerged in the eighteenth century and has led to a plethora of methods for answering many theoretical and applied questions. The last decade has witnessed the remarkable contributions of this classical optimization problem to machine learning. This paper is about where and how optimal transport is used in machine learning, with a focus on the question of scalable optimal transport. We provide a comprehensive survey of optimal transport while keeping the presentation as accessible as the nature of the topic and the context permit. First, we explain the optimal transport background and introduce different flavors (i.e., mathematical formulations), properties, and notable applications. We then address the fundamental question of how to scale optimal transport to cope with the current demands of big and high-dimensional data. We conduct a systematic analysis of the methods used in the literature for scaling OT and present the findings in a unified taxonomy. We conclude by presenting some open challenges and discussing potential future research directions. A live repository of related OT research papers is maintained at https://github.com/abdelwahed/OT_for_big_data.git

Paper Structure (44 sections, 53 equations, 13 figures)

Introduction
A Motivational Example
Background
Notation
Monge Formulation (Optimal Map)
Kantorovich Formulation (Optimal Plan)
Common OT Formulations
Regularized OT
Unbalanced and Partial OT
Sliced OT
Wasserstein Barycenter
Gromov-Wasserstein (GW)
Scaling OT Computation
Measures Simplification
Structuring the Optimal Plan
...and 29 more sections

Figures (13)

Figure 1: Optimal Transport geometry awareness. Individual word-to-word distances may not capture the semantic similarity between two documents due to the lack of common words. OT leverages the underlying geometry (captured by the ground cost) and lifts it to the Wasserstein space where the distance represents the cost of the optimal transportation of a whole document (distribution) to another.
Figure 2: Survey Position. Among OT review papers (chen2021optimal, kolouri2017optimal, zhang2021review, torres2021survey, huynh2021optimal), this work presents comprehensive and up-to-date coverage of OT in machine learning. The circles in the left column represent the existing reviews, with their size indicating the number of reviewed papers. The connections between the left and right columns depict the topics covered in each review article. The underlined "Scaling OT" is a key focus of this survey.
Figure 3: Outline of the Survey. We start by introducing the reader to OT through a motivational example (\ref{sec:motivational_example} Motivational Example) that paves the way for the discussion of OT formulations (\ref{sec:nota_background}, \ref{sec:common_ot_formulations} Background). Next, we present a taxonomy of methods for scaling OT to big data regimes (\ref{sec:scaling_ot} Scaling OT). Open issues and future research directions are then discussed (\ref{sec:future_directions} Discussion). In the supplementary material, we extend the background discussion (\ref{sec:ot_background} Extended Background) and include a summary of OT applications in machine learning (\ref{sec:ot_applictions} Applications). All the figures are best viewed in color.
Figure 4: Conceptual Depiction of Optimal Transport. (a) The optimal transport problem is stated as follows: given two point cloud objects $\mu$ and $\nu$ and the point-wise distances between them (i.e., the ground distance), find a legitimate way (i.e., a valid plan) to redistribute the mass of $\mu$ into $\nu$ at the least cost; the minimizing plan is denoted $\bm{P}^*$. The optimal transport cost is then given by $d_\text{OT}(\mu,\nu) = \langle \bm{C}, \bm{P}^* \rangle$. (b) Matching of two point cloud objects using optimal transport. The point correspondences estimated by the optimal plan are shown as straight pink lines. The OT literature is filled with formulations that extend this key concept to new situations and applications, for example formulations that allow (c) partial transportation or (d) transportation between incomparable spaces.
Figure 5: $\text{Sinkhorn}(\bm{K},\bm{a},\bm{b},\delta)$
...and 8 more figures
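
The captions of Figures 4 and 5 name a transport cost $\langle \bm{C}, \bm{P} \rangle$ and a routine $\text{Sinkhorn}(\bm{K},\bm{a},\bm{b},\delta)$ without spelling out the steps. The following is a minimal sketch of the standard Sinkhorn-Knopp scaling iteration, not the paper's own listing. It assumes that $\bm{K}$ is the Gibbs kernel $\exp(-\bm{C}/\varepsilon)$, that $\bm{a}$ and $\bm{b}$ are the source and target mass vectors, and that $\delta$ is a stopping tolerance on the marginal error. The names `transport_cost`, `eps`, and `max_iter` are hypothetical, and plain Python lists are used only to keep the sketch dependency-free.

```python
import math


def sinkhorn(K, a, b, delta=1e-9, max_iter=10_000):
    """Sinkhorn-Knopp scaling: find u, v so that P = diag(u) K diag(v)
    has row sums a and column sums b.

    K     : n x m kernel, assumed here to be K[i][j] = exp(-C[i][j] / eps)
    a, b  : source / target mass vectors (each sums to 1)
    delta : assumed here to be a tolerance on the row-marginal violation
    """
    n, m = len(a), len(b)
    u, v = [1.0] * n, [1.0] * m
    for _ in range(max_iter):
        # Row scaling: make the rows of diag(u) K diag(v) sum to a
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Column scaling: make the columns sum to b
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
        # Columns now match b exactly, so only the row marginals are checked
        err = sum(
            abs(u[i] * sum(K[i][j] * v[j] for j in range(m)) - a[i])
            for i in range(n)
        )
        if err < delta:
            break
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


def transport_cost(C, P):
    """Frobenius inner product <C, P> (hypothetical helper)."""
    return sum(C[i][j] * P[i][j] for i in range(len(C)) for j in range(len(C[0])))


# Toy example: two identical 1-D point clouds, squared-distance ground cost
x, y = [0.0, 1.0], [0.0, 1.0]
a, b = [0.5, 0.5], [0.5, 0.5]
C = [[(xi - yj) ** 2 for yj in y] for xi in x]
eps = 0.05  # entropic regularization strength (illustrative choice)
K = [[math.exp(-c / eps) for c in row] for row in C]

P = sinkhorn(K, a, b)
cost = transport_cost(C, P)
```

Because the two toy clouds coincide, the returned plan places almost all mass on the diagonal and `cost` comes out close to zero. Note that the result is an entropy-regularized plan, so it only approximates the unregularized $\bm{P}^*$ of Figure 4, and the approximation tightens as `eps` shrinks.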
