Flow with the Force Field: Learning 3D Compliant Flow Matching Policies from Force and Demonstration-Guided Simulation Data

Tianyu Li, Yihan Li, Zizhe Zhang, Nadia Figueroa

TL;DR

This work tackles compliance in contact-rich robotic manipulation by generating force-informed simulation data from a single human demonstration and learning a 3D flow-matching visuomotor policy that takes point cloud and force inputs. The policy outputs a reference pose trajectory, a force-guiding virtual target, and an impedance parameter, which are rolled out as a state-velocity field and executed via a Passive Impedance Controller to ensure safe, compliant interactions. Key innovations include Laplacian editing-based trajectory warping for data augmentation, a force-aware data generation pipeline, and zero-shot transfer to real Franka robots for block flipping and bi-manual moving. Experiments show high real-world success rates, reduced energy injection, and robust spatial and object generalization, illustrating the practical impact of combining lightweight sim data generation with compliant, vision-driven control.
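To make the "state-velocity field executed via a Passive Impedance Controller" idea concrete, the following is a minimal sketch of one common velocity-tracking damping law, F = -D (x_dot - v_des), where the damping matrix D is aligned with the desired velocity. It is an illustration only and is not taken from the paper: the function name `passive_damping_force` and the gains `d_along` and `d_perp` are hypothetical, and the way the policy's predicted impedance parameter would set such gains is an assumption.

```python
import numpy as np

def passive_damping_force(x_dot, v_des, d_along=50.0, d_perp=20.0):
    """Return F = -D (x_dot - v_des) for a 3D end effector.

    D has eigenvalue d_along in the direction of the desired velocity and
    d_perp in the two orthogonal directions (hypothetical gains).
    """
    x_dot = np.asarray(x_dot, dtype=float)
    v_des = np.asarray(v_des, dtype=float)
    speed = np.linalg.norm(v_des)
    if speed < 1e-9:
        # No preferred direction: fall back to isotropic damping.
        D = d_perp * np.eye(3)
    else:
        e1 = v_des / speed
        # D = d_perp * I + (d_along - d_perp) * e1 e1^T
        D = d_perp * np.eye(3) + (d_along - d_perp) * np.outer(e1, e1)
    return -D @ (x_dot - v_des)

# Robot at rest, field asks for motion along +x: force pushes along +x.
force = passive_damping_force([0.0, 0.0, 0.0], [1.0, 0.0, 0.0])
```

In this sketch, lowering `d_perp` makes the end effector softer against disturbances perpendicular to the commanded motion, which is one plausible way a learned impedance parameter could trade tracking accuracy for compliance.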

Abstract

While visuomotor policies have advanced in recent years, contact-rich tasks remain a challenge. Robotic manipulation tasks that require continuous contact demand explicit handling of compliance and force. However, most visuomotor policies ignore compliance and overlook the importance of physical interaction with the real world, which often leads to excessive contact forces or fragile behavior under uncertainty. Introducing force information into vision-based imitation learning could improve contact awareness, but may also require a large amount of data to perform well. One remedy for data scarcity is to generate data in simulation, yet computationally taxing processes are required to generate data good enough to avoid the Sim2Real gap. In this work, we introduce a framework for generating force-informed data in simulation, instantiated by a single human demonstration, and show how coupling it with a compliant policy improves the performance of a visuomotor policy learned from synthetic data. We validate our approach on real-robot tasks, including non-prehensile block flipping and bi-manual object moving, where the learned policy exhibits reliable contact maintenance and adaptation to novel conditions. Project Website: https://flow-with-the-force-field.github.io/webpage/

Paper Structure

This paper contains 15 sections, 12 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Given a single demonstration in simulation, we generate force and demonstration-guided data and transfer to real-world compliant visuomotor policy deployment.
  • Figure 2: Flow with the Force Field 3D Compliant Visuomotor Policy Learning Framework: Starting from a single simulation demonstration, we augment data by adding force-informed virtual targets and applying Laplacian editing to generate point cloud and force trajectories beyond the original demo. We train a flow-matching policy that takes point cloud and force as inputs and predicts actions, including an impedance parameter. At rollout, the policy is synthesized into a state-velocity field and executed with a Passive Impedance Controller for compliant behavior. Although our data is generated with only one simple box geometry for both tasks, the policies trained with our framework generalize beyond that single shape.
  • Figure 3: Example of trajectory warping with different modulation strategies. (a) is the demonstration trajectory, (b) shows the force-informed trajectory with virtual target, and (c) shows the trajectory warped by object and end effector initial pose, while (d) shows the complete trajectory we use in data generation warped both by object and end effector initial pose and force.
  • Figure 4: The red cubes show the location for spatial performance evaluation on (left) Block Flipping and (right) Bi-manual Moving.
  • Figure 5: Real-world object and spatial generalization results.
  • ...and 1 more figures
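
The Laplacian editing mentioned in the Figure 2 and Figure 3 captions can be sketched as a small least-squares problem: preserve the discrete Laplacian (local shape) of the demonstrated trajectory while pulling selected waypoints to new positions. The code below is a generic illustration of that technique and makes no claim about the paper's implementation; the function name `laplacian_edit`, the soft-constraint `weight`, and the choice of constrained waypoints are assumptions.

```python
import numpy as np

def laplacian_edit(traj, constraints, weight=1e3):
    """Warp a trajectory of shape (N, d) toward new waypoint targets.

    constraints maps a waypoint index to its new position. Local shape is
    kept by matching the second differences of the original trajectory.
    """
    traj = np.asarray(traj, dtype=float)
    n, d = traj.shape
    # Discrete Laplacian rows: x[i+1] - 0.5 * (x[i] + x[i+2]).
    L = np.zeros((n - 2, n))
    for i in range(n - 2):
        L[i, i] = -0.5
        L[i, i + 1] = 1.0
        L[i, i + 2] = -0.5
    delta = L @ traj  # Laplacian coordinates of the original trajectory
    # Soft positional constraints, weighted heavily so they dominate.
    C = np.zeros((len(constraints), n))
    targets = np.zeros((len(constraints), d))
    for row, (idx, pos) in enumerate(constraints.items()):
        C[row, idx] = weight
        targets[row] = weight * np.asarray(pos, dtype=float)
    A = np.vstack([L, C])
    b = np.vstack([delta, targets])
    warped, *_ = np.linalg.lstsq(A, b, rcond=None)
    return warped

# Straight-line demo from the origin to (1, 0, 0); move only the goal.
demo = np.linspace([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], 11)
warped = laplacian_edit(demo, {0: [0.0, 0.0, 0.0], 10: [1.0, 1.0, 0.0]})
```

Constraining the first and last waypoints corresponds loosely to warping by a new end effector start pose and a new object pose; additional constrained waypoints could stand in for a force-informed virtual target, though that mapping is an assumption here.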