Linear-Space Extragradient Methods for Fast, Large-Scale Optimal Transport

Matthew X. Burns; Jiaming Liang

TL;DR

This work tackles large-scale optimal transport under stringent memory constraints by introducing DXG, a dual-only extragradient method that operates in $O(n)$ memory while achieving near-state-of-the-art iteration complexity for OT. By establishing a dual-structured saddle-point formulation and proving equivalences with entropy-regularized OT, the authors extend the approach to Wasserstein barycenters and provide a CUDA-accelerated implementation. They show how the dual framework can recover primal iterates with linear-space storage and present convergence guarantees with non-asymptotic iteration complexity bounds, including a general parameter regime yielding $\mathcal{O}(\sqrt{n}\,\eta^{-1}\log\frac{n}{\varepsilon})$ iterations. Empirical results demonstrate DXG’s strong performance in weakly regularized EOT and $\ell_1$-cost settings, while also identifying limitations in barycenter tasks and outlining future directions for improvement.

Abstract

Optimal transport (OT) and its entropy-regularized form (EOT) have become increasingly prominent computational problems, with applications in machine learning and statistics. Recent years have seen a commensurate surge in first-order methods aiming to improve the complexity of large-scale (E)OT. However, there has been a consistent tradeoff: attaining state-of-the-art rates requires $\mathcal{O}(n^2)$ storage to enable ergodic primal averaging. In this work, we demonstrate that recently proposed primal-dual extragradient methods (PDXG) can be implemented entirely in the dual with $\mathcal{O}(n)$ storage. Additionally, we prove that regularizing the reformulated OT problem is equivalent to EOT with extensions to entropy-regularized barycenter problems, further widening the applications of the proposed method. The proposed dual-only extragradient method (DXG) achieves $\mathcal{O}(n^2\varepsilon^{-1})$ complexity for $\varepsilon$-approximate OT with $\mathcal{O}(n)$ memory. Numerical experiments demonstrate that the dual extragradient method scales favorably in non/weakly-regularized regimes compared to existing algorithms, though future work is needed to improve performance in certain problem classes.
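To illustrate why dual-only storage can suffice, the sketch below evaluates the marginals and transport cost of an entropic plan directly from two dual vectors, without ever materializing the $n \times n$ matrix. It is not the DXG algorithm. It assumes the standard Gibbs form $P_{ij} = \exp((f_i + g_j - C_{ij})/\eta)$ for an EOT plan, an $\ell_1$ cost on 1-D points, and the hypothetical names `streamed_marginals`, `f`, `g`, and `eta`.

```python
import math

def streamed_marginals(x, y, f, g, eta):
    """Row/column marginals and cost of P_ij = exp((f_i + g_j - C_ij) / eta).

    Assumption: C_ij = |x_i - y_j| (l1 cost on 1-D points), recomputed on
    the fly. Only O(n + m) numbers are stored; the plan is never stored.
    """
    n, m = len(x), len(y)
    row = [0.0] * n   # row sums of the implicit plan
    col = [0.0] * m   # column sums of the implicit plan
    cost = 0.0        # <C, P>, accumulated entry by entry
    for i in range(n):
        for j in range(m):
            c = abs(x[i] - y[j])
            p = math.exp((f[i] + g[j] - c) / eta)
            row[i] += p
            col[j] += p
            cost += c * p
    return row, col, cost

# Usage: two points on each side, zero potentials (illustrative values only).
row, col, cost = streamed_marginals([0.0, 1.0], [0.0, 1.0],
                                    [0.0, 0.0], [0.0, 0.0], eta=1.0)
```

The double loop still costs $O(n^2)$ time, but the working memory is linear. On a GPU the same pattern would typically be expressed as a fused reduction over on-the-fly cost entries.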
