Metric Flow Matching for Smooth Interpolations on the Data Manifold
Kacper Kapuśniak, Peter Potaptchik, Teodora Reu, Leo Zhang, Alexander Tong, Michael Bronstein, Avishek Joey Bose, Francesco Di Giovanni
TL;DR
This work tackles trajectory inference from cross-sectional data by adopting a data-aware geometry: interpolants are learned as approximate geodesics of a data-dependent metric $g$, so that probability paths $p_t$ stay near the data manifold rather than following Euclidean straight lines. Metric Flow Matching (MFM) generalizes Conditional Flow Matching in two stages: it first learns interpolants $x_{t,\eta}$ that minimize a geodesic energy ${\mathcal E}_g$, and then regresses the vector field $v_{t,\theta}$ under the metric-induced norm $\|\cdot\|_g$, yielding more meaningful reconstructions of the underlying dynamics. The framework is instantiated with two concrete, task-agnostic metrics (LAND and RBF) and an Optimal Transport-based variant (OT-MFM) that couples the marginals. It is evaluated on LiDAR navigation, unpaired image translation, and single-cell trajectory inference, where it achieves state-of-the-art results. Overall, MFM is a simulation-free, geometry-aware approach to trajectory modeling that adapts to curved data manifolds and yields lower-uncertainty, more meaningful interpolations.
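As a rough sketch of the two stages described above, the objectives could be written as follows. The coupling $q(x_0, x_1)$ of the marginals, the uniform sampling of $t$, and the loss names ${\mathcal L}_{\mathrm{geo}}$ and ${\mathcal L}_{\mathrm{MFM}}$ are notational assumptions made here for illustration; the paper's exact formulation may differ.

$$
{\mathcal L}_{\mathrm{geo}}(\eta) = \mathbb{E}_{t \sim U[0,1],\, (x_0, x_1) \sim q} \left[ \|\dot{x}_{t,\eta}\|_g^2 \right],
\qquad
{\mathcal L}_{\mathrm{MFM}}(\theta) = \mathbb{E}_{t \sim U[0,1],\, (x_0, x_1) \sim q} \left[ \| v_{t,\theta}(x_{t,\eta}) - \dot{x}_{t,\eta} \|_g^2 \right]
$$

Here $\dot{x}_{t,\eta}$ is the time derivative of the interpolant, and $\|u\|_g^2 = u^\top G(x)\, u$ for a positive-definite matrix $G(x)$ representing $g$ at the point $x$. The first loss is a Monte Carlo form of the geodesic energy ${\mathcal E}_g$. With $G = I$ and straight-line interpolants, the second loss reduces to the usual Conditional Flow Matching regression.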
Abstract
Matching objectives underpin the success of modern generative models and rely on constructing conditional paths that transform a source distribution into a target distribution. Despite being a fundamental building block, conditional paths have been designed principally under the assumption of Euclidean geometry, resulting in straight interpolations. However, this can be particularly restrictive for tasks such as trajectory inference, where straight paths might lie outside the data manifold, thus failing to capture the underlying dynamics giving rise to the observed marginals. In this paper, we propose Metric Flow Matching (MFM), a novel simulation-free framework for conditional flow matching where interpolants are approximate geodesics learned by minimizing the kinetic energy of a data-induced Riemannian metric. This way, the generative model matches vector fields on the data manifold, which corresponds to lower uncertainty and more meaningful interpolations. We prescribe general metrics to instantiate MFM, independent of the task, and test it on a suite of challenging problems including LiDAR navigation, unpaired image translation, and modeling cellular dynamics. We observe that MFM outperforms the Euclidean baselines, particularly achieving SOTA on single-cell trajectory prediction.
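To make the idea of a data-induced kinetic energy concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes a toy dataset on the unit circle and a hypothetical scalar metric $g(x) = 1 / (h(x) + \epsilon)$, where $h$ is a Gaussian kernel density over the data. This metric is chosen only for illustration and is not the LAND or RBF metric used in the paper. The function names `density_metric` and `path_energy` and the values of `sigma` and `eps` are likewise assumptions. The sketch compares the discretized energy of a straight interpolant with that of a path that follows the data.

```python
import numpy as np


def density_metric(x, data, sigma=0.2, eps=1e-3):
    """Hypothetical scalar metric factor g(x) = 1 / (h(x) + eps).

    h(x) is a sum of Gaussian kernels centred on the data, so g is small
    near the data and large far away from it.
    """
    # Pairwise squared distances, shape (n_query, n_data).
    sq_dists = np.sum((data[None, :, :] - x[:, None, :]) ** 2, axis=-1)
    h = np.exp(-sq_dists / (2.0 * sigma**2)).sum(axis=1)
    return 1.0 / (h + eps)


def path_energy(path, data):
    """Discretized kinetic energy: integral over t in [0, 1] of g(x_t) * ||dx_t/dt||^2."""
    dt = 1.0 / (len(path) - 1)
    deltas = np.diff(path, axis=0)              # finite-difference displacements
    midpoints = 0.5 * (path[1:] + path[:-1])    # evaluate the metric at segment midpoints
    g = density_metric(midpoints, data)
    return float(np.sum(g * np.sum(deltas**2, axis=1)) / dt)


# Toy data manifold: 200 points on the unit circle.
angles = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
data = np.stack([np.cos(angles), np.sin(angles)], axis=1)

# Two candidate interpolants between the same endpoints.
x0 = np.array([-1.0, 0.0])
x1 = np.array([1.0, 0.0])
t = np.linspace(0.0, 1.0, 101)

straight = (1.0 - t)[:, None] * x0 + t[:, None] * x1                 # Euclidean interpolant
arc = np.stack([-np.cos(np.pi * t), np.sin(np.pi * t)], axis=1)      # follows the upper semicircle

energy_straight = path_energy(straight, data)
energy_arc = path_energy(arc, data)

print(f"straight-line energy: {energy_straight:.3f}")
print(f"arc energy:           {energy_arc:.3f}")
```

Under this toy metric the arc is longer in Euclidean terms but has lower energy, because the straight line crosses the low-density interior of the circle where $g$ is large. In MFM the interpolant is not hand-picked as it is here: a parametric family $x_{t,\eta}$ is trained to minimize such an energy.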
