Metric Flow Matching for Smooth Interpolations on the Data Manifold

Kacper Kapuśniak, Peter Potaptchik, Teodora Reu, Leo Zhang, Alexander Tong, Michael Bronstein, Avishek Joey Bose, Francesco Di Giovanni

TL;DR

This work tackles trajectory inference from cross-sectional data by adopting a data-aware geometry: interpolants are learned as geodesics of a data-dependent metric $g$, ensuring that probability paths $p_t$ stay near the data manifold rather than following Euclidean straight lines. Metric Flow Matching (MFM) generalizes Conditional Flow Matching by first learning geodesic-compatible interpolants $x_{t,\eta}$ that minimize a geodesic energy ${\mathcal E}_g$ and then regressing the vector field $v_{t,\theta}$ under the metric-induced norm $\|\cdot\|_g$, yielding more meaningful reconstructions of the underlying dynamics. The framework introduces concrete, task-agnostic metrics (LAND and RBF) and an Optimal Transport-based variant (OT-MFM) to couple marginals, demonstrating strong performance on LiDAR, unpaired image translation, and especially single-cell trajectory inference where it achieves SOTA results. Overall, MFM provides a simulation-free, geometry-aware approach to trajectory modeling that can adapt to curved data manifolds and broad downstream tasks, reducing uncertainty and improving interpretability of interpolations.
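The two stages can be written out as follows. This is a sketch under stated assumptions, not the paper's exact statement: the correction network $\varphi_{t,\eta}$, the coupling $q$ of source and target samples, and the matrix $\mathbf{G}$ representing $g$ are notation assumed here, since the summary names only $x_{t,\eta}$, ${\mathcal E}_g$, $v_{t,\theta}$ and $\|\cdot\|_g$.

$$x_{t,\eta} = (1-t)\,x_0 + t\,x_1 + t(1-t)\,\varphi_{t,\eta}(x_0, x_1)$$

$${\mathcal E}_g(\eta) = \mathbb{E}_{t,\,(x_0,x_1)\sim q}\left[\dot{x}_{t,\eta}^{\top}\,\mathbf{G}(x_{t,\eta};{\mathcal D})\,\dot{x}_{t,\eta}\right]$$

$${\mathcal L}(\theta) = \mathbb{E}_{t,\,(x_0,x_1)\sim q}\left[\left\|v_{t,\theta}(x_{t,\eta}) - \dot{x}_{t,\eta}\right\|_{g}^{2}\right]$$

In this form the factor $t(1-t)$ keeps the endpoints fixed at $x_0$ and $x_1$. With $\mathbf{G}$ equal to the identity the energy is minimized by $\varphi_{t,\eta} = 0$, which recovers the straight-line interpolants of Conditional Flow Matching.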

Abstract

Matching objectives underpin the success of modern generative models and rely on constructing conditional paths that transform a source distribution into a target distribution. Despite being a fundamental building block, conditional paths have been designed principally under the assumption of Euclidean geometry, resulting in straight interpolations. However, this can be particularly restrictive for tasks such as trajectory inference, where straight paths might lie outside the data manifold, thus failing to capture the underlying dynamics giving rise to the observed marginals. In this paper, we propose Metric Flow Matching (MFM), a novel simulation-free framework for conditional flow matching where interpolants are approximate geodesics learned by minimizing the kinetic energy of a data-induced Riemannian metric. This way, the generative model matches vector fields on the data manifold, which corresponds to lower uncertainty and more meaningful interpolations. We prescribe general metrics to instantiate MFM, independent of the task, and test it on a suite of challenging problems including LiDAR navigation, unpaired image translation, and modeling cellular dynamics. We observe that MFM outperforms the Euclidean baselines, particularly achieving SOTA on single-cell trajectory prediction.
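To make the idea of kinetic energy under a data-induced metric concrete, the sketch below compares two fixed curves on a toy arch dataset. It is only an illustration: the diagonal LAND-style metric, the hyperparameters `sigma` and `eps`, the helper names, and the toy data are all assumptions, and MFM itself learns the interpolant with a neural network rather than comparing hand-picked paths.

```python
import numpy as np

def land_metric_diag(x, data, sigma=0.2, eps=1e-2):
    """Diagonal of an assumed LAND-style metric (diag(h(x)) + eps*I)^(-1)."""
    diff = data - x                                          # (N, d)
    w = np.exp(-np.sum(diff**2, axis=1) / (2.0 * sigma**2))  # kernel weights, (N,)
    h = np.sum(w[:, None] * diff**2, axis=0)                 # weighted local spread, (d,)
    return 1.0 / (h + eps)

def path_energy(path, data):
    """Discretized kinetic energy: mean over segments of v^T G(x) v."""
    dt = 1.0 / (len(path) - 1)
    vel = np.diff(path, axis=0) / dt       # finite-difference velocities
    mid = 0.5 * (path[1:] + path[:-1])     # evaluate the metric at segment midpoints
    G = np.stack([land_metric_diag(m, data) for m in mid])
    return float(np.mean(np.sum(G * vel**2, axis=1)))

# Toy data: a half circle ("arch") from (1, 0) to (-1, 0).
theta = np.linspace(0.0, np.pi, 200)
data = np.stack([np.cos(theta), np.sin(theta)], axis=1)

# Two candidate interpolants with the same endpoints.
t = np.linspace(0.0, 1.0, 101)
straight = np.stack([1.0 - 2.0 * t, np.zeros_like(t)], axis=1)
along_arch = np.stack([np.cos(np.pi * t), np.sin(np.pi * t)], axis=1)

e_straight = path_energy(straight, data)
e_arch = path_energy(along_arch, data)
print(f"straight: {e_straight:.2f}  along arch: {e_arch:.2f}")
```

Although the arch path is longer in Euclidean terms, it stays where the metric is small, so its energy under this metric comes out lower than that of the straight chord, which crosses the empty region inside the arch.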

Paper Structure

This paper contains 38 sections, 2 theorems, 38 equations, 7 figures, 8 tables, 2 algorithms.

Key Result

Proposition 1

Given a dataset ${\mathcal{D}}\subset \mathbb{R}^d$, let $g$ be any metric such that: (i) The eigenvalues of $\mathbf{G}(x;{\mathcal{D}})$ do not approach zero when $x$ is distant from ${\mathcal{D}}$; (ii) $\|\mathbf{G}(x;{\mathcal{D}})\|$ is sufficiently small if $x$ is close to ${\mathcal{D}}$. T
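Conditions (i) and (ii) can be checked numerically for a candidate metric. The sketch below does this for a diagonal LAND-style metric; the formula, the values of `sigma` and `eps`, and the Gaussian toy dataset are assumptions chosen for illustration, not details taken from the proposition.

```python
import numpy as np

def land_metric_diag(x, data, sigma=0.3, eps=1e-2):
    """Diagonal of an assumed LAND-style metric G(x; D) = (diag(h(x)) + eps*I)^(-1)."""
    diff = data - x                                          # (N, d)
    w = np.exp(-np.sum(diff**2, axis=1) / (2.0 * sigma**2))  # kernel weights, (N,)
    h = np.sum(w[:, None] * diff**2, axis=0)                 # weighted local spread, (d,)
    return 1.0 / (h + eps)

rng = np.random.default_rng(0)
data = rng.normal(scale=0.3, size=(500, 2))   # toy dataset clustered at the origin

g_near = land_metric_diag(np.zeros(2), data)             # query inside the data
g_far = land_metric_diag(np.array([10.0, 10.0]), data)   # query far from the data

print("near data:", g_near)
print("far away: ", g_far)
```

Far from the data the kernel weights vanish, so the diagonal entries approach `1 / eps` and stay bounded away from zero, in line with (i). Near the data the weighted spread `h` is large, which makes the entries small, in line with (ii).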

Figures (7)

  • Figure 1: In orange and violet the source and target distributions. On the left, straight interpolations vs interpolations following a data-dependent Riemannian metric. On the right, densities of reconstructed marginals at time $t=\frac{1}{2}$, using Conditional Flow Matching and Metric Flow Matching (MFM), respectively. MFM provides a more meaningful reconstruction supported on the data manifold.
  • Figure 1: Wasserstein distance between reconstructed marginal at time $1/2$ and ground-truth.
  • Figure 2: Interpolants for the Arch dataset (top row), Sphere dataset (middle row) and over LiDAR scans of Mt. Rainier (bottom row). In all cases, learning interpolants that minimize the LAND metric \ref{eq:metric_formulation_LAND} leads to more meaningful matchings.
  • Figure 3: Qualitative comparison for image translation. By designing interpolants on the data manifold, $\text{OT-MFM}_{\hbox{$\mathrm{RBF}$}}$ better preserves input features.
  • Figure 4: Additional qualitative comparison for the task of unpaired translation between OT-CFM and $\text{OT-MFM}_{\hbox{$\mathrm{RBF}$}}$.
  • ...and 2 more figures

Theorems & Definitions (4)

  • Definition 1
  • Proposition 1 (Informal)
  • Theorem B.1 (Formal statement of Proposition 1)
  • Proof of \ref{prop:geodesics}