FoAR: Force-Aware Reactive Policy for Contact-Rich Robotic Manipulation

Zihao He, Hongjie Fang, Jingjing Chen, Hao-Shu Fang, Cewu Lu

TL;DR

FoAR tackles the challenge of contact-rich robotic manipulation by integrating high-frequency force/torque sensing with vision through a future contact predictor that gates multimodal fusion. Built on the diffusion-based RISE framework, FoAR processes a sparse 3D point-cloud representation and a force/torque history to predict actions over a horizon, while reactive control refines commands using current force data. The key contribution is a fusion mechanism driven by predicted contact probability $\phi(t)$, enabling robust, force-aware manipulation across non-contact and contact phases with simple end-effector position control; experiments on wiping, peeling, and chopping demonstrate superior performance and robustness with only $50$ demonstrations per task. This approach offers a practical pathway to more dexterous, reliable manipulation in real-world settings by effectively leveraging force feedback in multimodal policies.

Abstract

Contact-rich tasks present significant challenges for robotic manipulation policies due to the complex dynamics of contact and the need for precise control. Vision-based policies often struggle with the skill required for such tasks, as they typically lack critical contact feedback modalities such as force/torque information. To address this issue, we propose FoAR, a force-aware reactive policy that combines high-frequency force/torque sensing with visual inputs to improve performance in contact-rich manipulation. Built upon the RISE policy, FoAR incorporates a multimodal feature fusion mechanism guided by a future contact predictor, enabling dynamic adjustment of force/torque data usage between non-contact and contact phases. Its reactive control strategy also allows FoAR to accomplish contact-rich tasks accurately through simple position control. Experimental results demonstrate that FoAR significantly outperforms all baselines across various challenging contact-rich tasks while maintaining robust performance under unexpected dynamic disturbances. Project website: https://tonyfang.net/FoAR/
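
The gating idea described above (force/torque features used differently in non-contact and contact phases, guided by a predicted contact probability) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `gate_force_features`, the hard threshold of 0.5, and the use of concatenation as the fusion operator are all assumptions made for illustration, and the feature vectors are plain lists rather than learned embeddings.

```python
from typing import List


def gate_force_features(
    scene_feat: List[float],
    force_feat: List[float],
    contact_prob: float,
    threshold: float = 0.5,  # hypothetical value, not taken from the paper
) -> List[float]:
    """Fuse scene and force features, gated by a predicted contact probability.

    Assumption: a hard gate. Below `threshold` the force features are zeroed,
    so the fused vector carries vision information only; at or above it the
    force features pass through unchanged.
    """
    if not 0.0 <= contact_prob <= 1.0:
        raise ValueError("contact_prob must lie in [0, 1]")
    gate = 1.0 if contact_prob >= threshold else 0.0
    gated_force = [gate * f for f in force_feat]
    # Concatenation stands in for whatever fusion operator the real model uses.
    return list(scene_feat) + gated_force


scene = [0.2, -0.1]
force = [1.5, 0.3]
free_space = gate_force_features(scene, force, contact_prob=0.1)
in_contact = gate_force_features(scene, force, contact_prob=0.9)
```

In this sketch a low predicted contact probability suppresses the force/torque features entirely, while a high one lets them through. A soft gate (scaling by the probability itself) would be an equally plausible reading of the description.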

Paper Structure

This paper contains 17 sections, 2 equations, 7 figures, 5 tables, 1 algorithm.

Figures (7)

  • Figure 1: Overview of the FoAR Policy for Contact-Rich Robotic Manipulations. Vision alone struggles to distinguish contact from non-contact states in contact-rich tasks, underscoring the need for integrating force/torque information. Our FoAR policy combines vision and force/torque inputs to predict robot actions along with a future contact probability $\phi$. Reactive control then refines actions dynamically based on current and predicted future contact states, enabling precise, force-aware manipulations for contact-rich tasks.
  • Figure 2: FoAR Architecture. FoAR consists of a point cloud encoder [RISE], a force/torque encoder, a future contact predictor, and a diffusion action head [Diffusion Policy]. The scene features and force features are fused under the guidance of the future contact predictor.
  • Figure 3: Tasks. We design three challenging contact-rich tasks, each focusing on a different aspect of contact-rich manipulation. The tasks involve both non-contact and contact phases so that policy performance is evaluated thoroughly.
  • Figure 4: Qualitative Results of the Peeling Task. The green background indicates the robot fully peels the cucumber, while yellow and red backgrounds represent partial peeling and peeling failure, respectively.
  • Figure 5: Evaluation Metrics of the Chopping Task. We encourage the robot to divide the pepper into several uniform small segments, without segments sticking together due to partial cuts.
  • ...and 2 more figures