Differentiable Cost-Parameterized Monge Map Estimators

Samuel Howard; George Deligiannidis; Patrick Rebeschini; James Thornton

TL;DR

This work addresses learning optimal transport maps when problem-specific ground costs are unknown or suboptimal by introducing a differentiable Monge map estimator that jointly learns a convex cost and the OT map. The method parameterizes the cost with an Input Convex Neural Network and uses an entropic OT-based mapping estimator, enabling end-to-end differentiation through Sinkhorn and the inverse of the convex gradient, with optional augmentation via diffeomorphisms to incorporate prior structure. It demonstrates that optimizing the Monge map directly yields better alignment with known pairings and trajectories (e.g., Live-seq data) while producing interpretable, low-ambiguity cost functions, and it supports learning from partial information such as labeled pairs. The approach has practical impact for tailored OT in applications like trajectory inference and generative modelling, where problem-specific costs and direct map estimators improve robustness and interpretability.

Abstract

Within the field of optimal transport (OT), the choice of ground cost is crucial to ensuring that the optimality of a transport map corresponds to usefulness in real-world applications. It is therefore desirable to use known information to tailor cost functions and hence learn OT maps which are adapted to the problem at hand. By considering a class of neural ground costs whose Monge maps have a known form, we construct a differentiable Monge map estimator which can be optimized to be consistent with known information about an OT map. In doing so, we simultaneously learn both an OT map estimator and a corresponding adapted cost function. Through suitable choices of loss function, our method provides a general approach for incorporating prior information about the Monge map itself when learning adapted OT maps and cost functions.
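As a rough illustration of what a cost-parameterized Monge map estimator can look like, the sketch below uses the standard entropic form for costs $c(x,y) = h(x-y)$, namely $T(x) = x - (\nabla h)^{-1}(\nabla f(x))$, where $f$ is the entropic potential obtained from Sinkhorn iterations. It is not the authors' implementation. To stay self-contained it replaces the neural convex cost with a hypothetical quadratic $h(z) = \tfrac{1}{2} z^\top A z$ (with $A$ symmetric positive definite), for which $\nabla h$ and its inverse are available in closed form. All function names, the value of `eps`, and the iteration count are assumptions made for the example.

```python
import numpy as np

def logsumexp(a, axis):
    # Numerically stable log-sum-exp along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def cost_matrix(X, Y, A):
    # c(x, y) = h(x - y) with the stand-in cost h(z) = 0.5 * z^T A z.
    D = X[:, None, :] - Y[None, :, :]                  # shape (n, m, d)
    return 0.5 * np.einsum("nmi,ij,nmj->nm", D, A, D)

def sinkhorn_target_potential(X, Y, A, eps=0.5, iters=200):
    # Log-domain Sinkhorn with uniform weights; returns the potential g on Y.
    n, m = len(X), len(Y)
    log_a, log_b = -np.log(n), -np.log(m)
    C = cost_matrix(X, Y, A)
    g = np.zeros(m)
    for _ in range(iters):
        f = -eps * logsumexp((g[None, :] - C) / eps + log_b, axis=1)
        g = -eps * logsumexp((f[:, None] - C) / eps + log_a, axis=0)
    return g

def entropic_map(X_new, Y, g, A, eps=0.5):
    # Softmax weights w_j(x) proportional to exp((g_j - c(x, y_j)) / eps);
    # the uniform target weights cancel in the normalisation.
    logits = (g[None, :] - cost_matrix(X_new, Y, A)) / eps
    logits = logits - logits.max(axis=1, keepdims=True)
    W = np.exp(logits)
    W = W / W.sum(axis=1, keepdims=True)
    # Gradient of the entropic potential: sum_j w_j(x) * grad_h(x - y_j),
    # where grad_h(z) = A z for symmetric A.
    D = X_new[:, None, :] - Y[None, :, :]
    grad_f = np.einsum("nm,nmd->nd", W, D @ A)
    # T(x) = x - (grad_h)^{-1}(grad_f(x)), with (grad_h)^{-1}(u) = A^{-1} u.
    T = X_new - np.linalg.solve(A, grad_f.T).T
    return T, W

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
Y = rng.normal(size=(40, 2)) + 3.0
A = np.array([[2.0, 0.5], [0.5, 1.0]])                 # hypothetical cost parameters
g = sinkhorn_target_potential(X, Y, A)
T, W = entropic_map(X, Y, g, A)
```

For this quadratic stand-in the map reduces to a weighted average of the target points, with weights that still depend on `A`. With a general strictly convex $h$, the closed-form solve would be replaced by whatever inverse of $\nabla h$ the parameterization provides. Writing the same steps in an automatic-differentiation framework would let a loss on `T` be differentiated with respect to the cost parameters.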

Paper Structure (31 sections, 5 theorems, 42 equations, 4 figures, 1 table, 1 algorithm)

Introduction
Background on Optimal Transport
A Differentiable Monge Map Estimator
Differentiable, Cost-Parameterized Transport Maps
Augmenting with Diffeomorphisms
Experiments
Conclusion
Method
Cost function parameterization
Convex function $h$
Diffeomorphisms $\Phi_\mu, \Phi_\nu$
Choice of Loss Function
Labelled datapoints
Properties of Map Displacements
Reverse map estimators
...and 16 more sections

Key Result

Theorem 2.1

For measures $\mu$ and $\nu$ on a compact domain $\Omega \subset {\mathbb R}^d$ and a cost of the form $c(x,y) = h(x-y)$ for a strictly convex function $h$, there exists an optimal plan $\pi^\star$ for the Kantorovich problem. If $\mu$ is absolutely continuous and $\partial \Omega$ is negligible, th

Figures (4)

Figure 1: Using the squared-Euclidean cost can result in incorrect map estimators that do not align with known ground-truth mappings. By optimizing a differentiable cost-parameterized Monge map estimator to resemble known information, we can obtain Monge map estimators and corresponding cost functions which are consistent with the ground-truth.
Figure 2: (left) By optimizing the map directly, we learn a Monge map estimator which agrees with the Live-seq trajectories. (middle) Optimizing according to the coupling matrix with a cost $h(\Phi(x)-\Phi(y))$ transports some points correctly, but struggles to learn a good cost overall (see Table 1). (right) For a general MLP cost, it is difficult to obtain a good mapping estimator.
Figure 3: Results for the synthetic limited labelled pairs experiment. Optimizing the map estimator results in a mapping that aligns significantly better with the known paired points, and also demonstrates improved prediction for out-of-sample points. The learned cost appears comparable to those learned by optimizing the coupling.
Figure 4: (middle) The learned low-rank mapping exhibits displacements primarily along a 2-dimensional plane, as demonstrated by the small final singular value. (right) The displacements of the learned 2-directional mapping occur primarily along the displayed directions, which were learned during training.

Theorems & Definitions (7)

Theorem 2.1 (Theorem 1.17, Santambrogio 2015)
Proposition 3.0
Theorem 3.1
Proposition 5.0
proof
Theorem 5.1
proof
