A Contact-Safe Reinforcement Learning Framework for Contact-Rich Robot Manipulation

Xiang Zhu, Shucheng Kang, Jianyu Chen

TL;DR

The paper addresses safety in reinforcement learning for contact-rich robot manipulation by introducing a hierarchical, contact-safe framework that enforces task-space safety via Cartesian variable-impedance control and joint-space safety via a momentum-observer–based contact detector and a contact-aware controller. The RL policy is trained in simulation (with a VAE for image embeddings) and deployed on a real Franka Panda, achieving compliant contacts (end-effector forces well below the safety threshold) and strong disturbance rejection. Key innovations include a momentum observer for collision detection, a disturbance compensation impedance, and a task-consistent null-space projection under contact, enabling safe sim-to-real transfer for wiping tasks. The results demonstrate significantly reduced contact forces and improved robustness to unseen disturbances, indicating practical impact for deploying RL in real-world contact-rich manipulation scenarios.

Abstract

Reinforcement learning (RL) shows great potential for solving complex contact-rich robot manipulation tasks. However, the safety of using RL in the real world is a crucial problem, since unexpected dangerous collisions might happen when the RL policy is imperfect during training or in unseen scenarios. In this paper, we propose a contact-safe reinforcement learning framework for contact-rich robot manipulation, which maintains safety in both the task space and the joint space. When the RL policy causes unexpected collisions between the robot arm and the environment, our framework is able to immediately detect the collision and keep the contact force small. Furthermore, the end-effector is constrained to perform contact-rich tasks compliantly while remaining robust to external disturbances. We train the RL policy in simulation and transfer it to the real robot. Real-world experiments on robot wiping tasks show that our method keeps the contact force small in both the task space and the joint space, even when the policy faces an unseen scenario with an unexpected collision, while rejecting disturbances on the main task.

Paper Structure (20 sections, 27 equations, 5 figures)

Figures (5)

  • Figure 1: Safety under an unseen scenario: Our trained RL agent finishes the wiping task properly (top left figure). However, when an unseen object appears (top right figure), the RL policy fails and unexpectedly commands the robot to move right. Such behavior might cause unexpected contact. In this experiment, the robot collides with a plastic water bottle that we hold. The green-bordered pictures show that our method generates compliant contact behavior with a small contact force, whereas the baseline, shown in the red-bordered pictures, generates nearly 3 times the contact force of our method. The baseline significantly deforms the water bottle, while our method does not deform it at all. The bottom left two figures show the contact force measured with a thrust meter, and the bottom right two figures show the deformation of the water bottle.
  • Figure 2: Framework Overview: Our method consists of a low-rate RL policy and a high-rate controller. The RL policy takes the image and the robot end-effector pose as input, and outputs task space information. We use inverse kinematics to define the posture task. The controller consists of a contact-aware controller and a free space controller, with a momentum observer serving as the contact detector. The controller switches to the contact-aware controller when contact is detected and back to the free space controller otherwise. The final command torque is sent to the robot.
  • Figure 3: Task Space Safety: The bottom three figures plot the 3-axis end-effector force while the robot wipes the whiteboard. Task contact is considered safe when the contact force stays below $40\,\mathrm{N}$. The maximum force of our method is about $5\,\mathrm{N}$, far below the safe contact threshold, and the wiping task is still performed successfully. The black dashed lines mark the times at which the situations shown in the top images occurred, and the other colored dashed lines plot the results of the baseline.
  • Figure 4: Joint Space Safety: The bottom two figures plot the estimated external joint contact force induced by unexpected contact at the robot arm.
  • Figure 5: Task Disturbance Rejection: The blue and orange arrows show the disturbance forces applied by a human. The blue one was applied first and the orange one next. The bottom right figure shows the end-effector position error of our method under the unexpected contact. The bottom left figure shows the result of the baseline.
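
The contact detection and controller switching described in Figures 2 and 4 can be illustrated with a minimal sketch. It uses the textbook generalized-momentum residual, which behaves as a first-order low-pass estimate of the external joint torque. This is not the authors' implementation: the class and function names, the single-joint pendulum model, the observer gain, the detection threshold, and the PD gains are all assumptions chosen for illustration.

```python
import math


class MomentumObserver:
    """Discrete generalized-momentum residual r, a low-pass estimate of tau_ext.

    Continuous form: r = K * (p - p0 - integral(tau - beta + r) dt),
    with p = M(q) qd and beta = g(q) - C(q, qd)^T qd.
    """

    def __init__(self, p0, gain, dt, threshold):
        self.p0 = list(p0)          # momentum when the observer starts
        self.gain = gain            # observer gain K [1/s] (assumed value)
        self.dt = dt                # control period [s]
        self.threshold = threshold  # detection threshold [N m] (assumed value)
        self.integral = [0.0] * len(self.p0)
        self.residual = [0.0] * len(self.p0)

    def update(self, p, tau, beta):
        """Advance one step; tau and beta are the values applied over that step."""
        for i in range(len(p)):
            self.integral[i] += (tau[i] - beta[i] + self.residual[i]) * self.dt
            self.residual[i] = self.gain * (p[i] - self.p0[i] - self.integral[i])
        return self.residual

    def in_contact(self):
        return any(abs(r) > self.threshold for r in self.residual)


def simulate_collision():
    """Hypothetical 1-DOF pendulum hit by a -3 N m external torque at t = 1 s."""
    inertia, mgl = 0.5, 2.0         # assumed link inertia and gravity constant
    dt, q_des = 0.001, 0.3
    q, qd = q_des, 0.0
    obs = MomentumObserver(p0=[inertia * qd], gain=50.0, dt=dt, threshold=1.0)
    detected_at = None
    mode = "free_space"
    for k in range(2000):
        t = k * dt
        tau_ext = -3.0 if t >= 1.0 else 0.0
        grav = mgl * math.sin(q)
        # Gravity compensation plus PD; the observer never sees tau_ext directly.
        tau = grav + 20.0 * (q_des - q) - 2.0 * qd
        qd += (tau + tau_ext - grav) / inertia * dt
        q += qd * dt
        obs.update([inertia * qd], [tau], [grav])
        # Switch controllers on the detector output, as in the Figure 2 caption.
        mode = "contact_aware" if obs.in_contact() else "free_space"
        if detected_at is None and obs.in_contact():
            detected_at = t + dt
    return detected_at, obs.residual[0], mode


if __name__ == "__main__":
    print(simulate_collision())
```

In this sketch the residual converges to the applied external torque, and the threshold crossing happens a few milliseconds after the collision begins; a larger observer gain shortens that delay at the cost of more sensitivity to model error and sensor noise.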