Displacement-Sparse Neural Optimal Transport
Peter Chen, Yue Xie, Qingpeng Zhang
TL;DR
This work addresses the interpretability gap in neural optimal transport by learning displacement-sparse maps within neural OT solvers. It introduces a biased minimax formulation with a general sparsity penalty, enabled by ICNNs, and a novel smoothed $\boldsymbol{\tau_0}$ regularizer that supports non-proximal penalties and general elastic costs. The authors prove theoretical guarantees as the sparsity strength $\boldsymbol{\lambda}$ vanishes and design an adaptive, simulated-annealing-based control to balance sparsity and feasibility in high-dimensional settings. Empirically, the method improves interpretability and downstream utility on synthetic sc-RNA perturbations and real 4i perturbation data, outperforming both exact OT and $\boldsymbol{\tau_{L1}}$ baselines in terms of dimensionality control and gene overlap. Overall, the approach yields more interpretable, low-dimensional transport maps suitable for large-scale biological analyses while remaining scalable to high-dimensional data.
Abstract
Optimal transport (OT) aims to find a map $T$ that transports mass from one probability measure to another while minimizing a cost function. Recently, neural OT solvers have gained popularity in high-dimensional biological applications such as drug perturbation, owing to their superior computational and memory efficiency compared with traditional exact Sinkhorn solvers. However, the overly complex high-dimensional maps learned by neural OT solvers often suffer from poor interpretability. Prior work addressed this issue in the context of exact OT solvers by introducing \emph{displacement-sparse maps} via a designed elastic cost, but that method does not carry over to neural OT settings. In this work, we propose an intuitive and theoretically grounded approach to learning \emph{displacement-sparse maps} within neural OT solvers. Building on our new formulation, we introduce a novel smoothed $\ell_0$ regularizer that outperforms the $\ell_1$-based alternative from prior work. Leveraging the flexibility of Input Convex Neural Networks (ICNNs), we further develop a heuristic framework for adaptively controlling sparsity intensity, an approach uniquely enabled by the neural OT paradigm. We demonstrate the necessity of this adaptive framework in large-scale, high-dimensional training, showing not only improved accuracy but also practical ease of use for downstream applications.
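To make the contrast between an $\ell_1$ penalty and a smoothed $\ell_0$ penalty on the displacement $T(x) - x$ concrete, the following is a minimal sketch. It assumes a Gaussian surrogate, $\sum_i \bigl(1 - \exp(-z_i^2 / (2\sigma^2))\bigr)$, which is one common way to smooth the $\ell_0$ count; the abstract does not specify the exact form of the regularizer, so the function `smoothed_l0`, the parameter `sigma`, and the toy points are all hypothetical and chosen only for illustration.

```python
import numpy as np


def smoothed_l0(z, sigma=0.5):
    """Gaussian surrogate for the l0 count (an assumed form, not taken from the paper).

    Each coordinate contributes a value in [0, 1): near 0 when z_i is close to
    zero, and near 1 when |z_i| is large relative to sigma.
    """
    z = np.asarray(z, dtype=float)
    return float(np.sum(1.0 - np.exp(-z**2 / (2.0 * sigma**2))))


def l1(z):
    """Plain l1 norm of a displacement vector."""
    return float(np.sum(np.abs(np.asarray(z, dtype=float))))


# Hypothetical source point and two candidate images under a transport map.
x = np.array([0.0, 1.0, 2.0, 3.0])
sparse_target = np.array([0.0, 1.0, 2.0, 7.0])  # moves one coordinate by 4
dense_target = np.array([1.0, 2.0, 3.0, 4.0])   # moves every coordinate by 1

sparse_disp = sparse_target - x
dense_disp = dense_target - x

print("l1          sparse vs dense:", l1(sparse_disp), l1(dense_disp))
print("smoothed l0 sparse vs dense:", smoothed_l0(sparse_disp), smoothed_l0(dense_disp))
```

In this toy case both displacements have the same $\ell_1$ norm, so an $\ell_1$ penalty cannot tell them apart, whereas the smoothed surrogate assigns a lower value to the displacement that moves a single coordinate. As `sigma` shrinks, the surrogate approaches the exact count of nonzero coordinates, at the price of a less smooth objective.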
