Riemannian Flow Matching Policy for Robot Motion Learning

Max Braun, Noémie Jaquier, Leonel Rozo, Tamim Asfour

TL;DR

The paper addresses learning robot visuomotor policies when states lie on Riemannian manifolds. It introduces RFMP, a Riemannian extension of Flow Matching that uses a geodesic-informed vector field to transport a base distribution toward the demonstrations. RFMP conditions the learned vector field on observations, employs a receding horizon for temporal consistency, and performs inference by solving an ODE, which yields smoother trajectories and faster runtime than diffusion-based baselines. The paper details an implementation based on simple MLP architectures and evaluates it on the LASA datasets in both trajectory-based and visuomotor settings, on Euclidean and spherical manifolds. In these experiments, RFMP achieves competitive or superior smoothness and notably lower inference times while preserving the intrinsic manifold geometry, which highlights its potential for real-time robotic control with multimodal demonstrations.

Abstract

We introduce Riemannian Flow Matching Policies (RFMP), a novel model for learning and synthesizing robot visuomotor policies. RFMP leverages the efficient training and inference capabilities of flow matching methods. By design, RFMP inherits the strengths of flow matching: the ability to encode high-dimensional multimodal distributions, commonly encountered in robotic tasks, and a very simple and fast inference process. We demonstrate the applicability of RFMP to both state-based and vision-conditioned robot motion policies. Notably, as the robot state resides on a Riemannian manifold, RFMP inherently incorporates geometric awareness, which is crucial for realistic robotic tasks. To evaluate RFMP, we conduct two proof-of-concept experiments, comparing its performance against Diffusion Policies. Although both approaches successfully learn the considered tasks, our results show that RFMP provides smoother action trajectories with significantly lower inference times.
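
To make the geodesic-informed vector field and the ODE-based inference mentioned in the summary more concrete, the following is a minimal NumPy sketch on the sphere $\mathcal{S}^{2}$. It is not the authors' implementation. The function names (`sphere_exp`, `sphere_log`, `target_velocity`, `integrate`) are hypothetical, and it assumes the common Riemannian flow matching construction in which a sample moves along the geodesic from a base point `x0` to a data point `x1` and the regression target is the velocity of that geodesic. In place of a trained, observation-conditioned network, the closed-form target field is integrated directly with exponential-map Euler steps.

```python
import numpy as np

def sphere_exp(x, v, eps=1e-12):
    """Exponential map on the unit sphere: move from x along tangent vector v."""
    norm = np.linalg.norm(v)
    if norm < eps:
        return x
    return np.cos(norm) * x + np.sin(norm) * v / norm

def sphere_log(x, y, eps=1e-12):
    """Logarithmic map: tangent vector at x pointing to y, with length d(x, y)."""
    cos_theta = np.clip(np.dot(x, y), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    u = y - cos_theta * x  # component of y orthogonal to x
    norm = np.linalg.norm(u)
    if norm < eps:
        return np.zeros_like(x)
    return theta * u / norm

def target_velocity(x0, x1, t):
    """Point on the geodesic from x0 to x1 at time t and its velocity (assumed target)."""
    xt = sphere_exp(x0, t * sphere_log(x0, x1))
    ut = sphere_log(xt, x1) / (1.0 - t)  # valid for t < 1
    return xt, ut

def integrate(field, x0, n_steps=50):
    """Euler integration on the manifold: each step follows the exponential map."""
    x, dt = x0, 1.0 / n_steps
    for k in range(n_steps):
        x = sphere_exp(x, dt * field(x, k * dt))
        x = x / np.linalg.norm(x)  # guard against numerical drift off the sphere
    return x

# Usage: one base sample and one "demonstration" point (illustrative values).
x0 = np.array([0.0, 0.0, 1.0])
x1 = np.array([1.0, 0.0, 0.0])
xt, ut = target_velocity(x0, x1, 0.3)

# Stand-in for a learned field v(x, t | o): the closed-form conditional target.
x_end = integrate(lambda x, t: sphere_log(x, x1) / (1.0 - t), x0)
```

In a learned policy, the lambda passed to `integrate` would be replaced by a network that also receives the observation, and the intermediate `(xt, ut)` pairs would serve as regression targets during training. Because every update goes through the exponential map, the integrated samples stay on the sphere by construction.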

Paper Structure (12 sections, 7 equations, 5 figures, 4 tables, 1 algorithm)

Figures (5)

  • Figure 1: Learned RFMP flows from the base distribution to the LASA datasets $\mathsf{S}$ and $\mathsf{W}$ on both $\mathbb{R}^2$ (top) and $\mathcal{S}^{2}$ (bottom). The flow is conditioned on random observations $\bm{o}$ from the training dataset.
  • Figure 2: Demonstrations and learned trajectories on the LASA datasets $\mathsf{S}$ in $\mathbb{R}^2$ (left), on the LASA datasets $\mathsf{S}$, $\mathsf{W}$ projected on $\mathcal{S}^{2}$ (middle-left, middle-right), and on a multimodal dataset made of mirrored datasets of the letter $\mathsf{L}$ projected on $\mathcal{S}^{2}$ (right). Reproductions start at the same initial observations as the demonstrations, or from randomly-sampled observations in the demonstration dataset neighborhood. Trajectory starts are depicted by dots in the multimodal case.
  • Figure 3: Demonstrations and learned trajectories on the LASA datasets $\mathsf{S}$ and $\mathsf{W}$ with different prediction horizons $T_a=\{2,4,8\}$ (from left to right). Reproductions start at the same initial observations as the demonstrations, or from randomly-sampled observations in their neighborhood.
  • Figure 4: Examples of visual observations at the end of a demonstration of the LASA dataset $\mathsf{S}$.
  • Figure 5: Demonstrations and trajectories reproduced by the visuomotor RFMP and DP on the LASA datasets $\mathsf{S}$ and $\mathsf{J}$ in $\mathbb{R}^2$ (left) and on the LASA datasets $\mathsf{S}$ and $\mathsf{W}$ projected on $\mathcal{S}^{2}$ (right).