Tac2Motion: Contact-Aware Reinforcement Learning with Tactile Feedback for Robotic Hand Manipulation

Yitaek Kim, Casper Hewson Rask, Christoffer Sloth

TL;DR

Tac2Motion tackles contact-rich in-hand manipulation by integrating tactile sensing into both the observation space and the reward design, enabling a contact-aware policy to grasp firmly while reconfiguring its fingers for smooth gaiting. The approach introduces tactile-based rewards (CPR, CRR, RR) and penalty terms, plus a virtual-torque mechanism that emulates patch contact; the policy is trained with PPO and demonstrated on lid-opening tasks. Results show faster, more data-efficient learning, better generalization across lid geometries, and successful sim-to-real transfer to a Shadow Hand on a UR10e. This work advances tactile-informed reinforcement learning for dexterous manipulation, with practical impact on robust manipulation under uncertain dynamics.

Abstract

This paper proposes Tac2Motion, a contact-aware reinforcement learning framework that facilitates the learning of contact-rich in-hand manipulation tasks, such as removing a lid. To this end, we propose tactile sensing-based reward shaping and incorporate the tactile signals into the observation space through an embedding. The designed rewards encourage the agent to maintain a firm grasp and smooth finger gaiting at the same time, leading to higher data efficiency and more robust performance than the baseline. We verify the proposed framework on a lid-opening scenario, showing that the trained policy generalizes to a couple of object types and to varying dynamics such as torsional friction. Lastly, the learned policy is demonstrated on a multi-fingered robotic hand from Shadow Robot, showing that the control policy can be transferred to the real world. A video is available at https://youtu.be/poeJBPR7urQ.

Paper Structure

This paper contains 16 sections, 10 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Outline of Tac2Motion, including model training and transfer of the contact-aware control policy.
  • Figure 2: Illustration of the distance $d^j_i$ between the contact reference object bases and the tactile sensors. The contact reference object bases are shown as red boxes, and the blue dots are tactile sensors.
  • Figure 3: Training performance of each method over 20M steps across different object types, sizes, and dynamics. The results show that Tac2Motion learns the torsional motion faster than the other methods and achieves superior final performance.
  • Figure 4: Snapshots from experimental deployment of the learned control policy for removing the lid with a multi-fingered robotic hand.
  • Figure A.1: Lid types used in the simulation. Each lid has multiple virtual boxes on its rim that serve as contact guides.
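
The distance $d^j_i$ in the Figure 2 caption can be pictured with a small numerical sketch. The code below is illustrative only and is not the paper's reward definition. It assumes that $i$ indexes tactile sensors and $j$ indexes contact reference boxes, that both are given as 3D positions, and that a shaping term decays exponentially with the distance to the nearest reference. The function names `pairwise_distances` and `proximity_reward` and the `scale` parameter are hypothetical.

```python
import numpy as np


def pairwise_distances(sensor_pos: np.ndarray, ref_pos: np.ndarray) -> np.ndarray:
    """Return D with D[i, j] = Euclidean distance from sensor i to reference j.

    sensor_pos: (S, 3) array of tactile sensor positions (assumed layout).
    ref_pos:    (R, 3) array of contact reference positions (assumed layout).
    """
    # Broadcast to shape (S, R, 3), then take the norm over the last axis.
    diff = sensor_pos[:, None, :] - ref_pos[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def proximity_reward(sensor_pos: np.ndarray, ref_pos: np.ndarray, scale: float = 10.0) -> float:
    """Hypothetical shaping term: mean over sensors of exp(-scale * nearest distance).

    This is an assumed form for illustration, not the CPR, CRR, or RR terms
    defined in the paper.
    """
    d = pairwise_distances(sensor_pos, ref_pos)
    nearest = d.min(axis=1)  # distance from each sensor to its closest reference
    return float(np.exp(-scale * nearest).mean())


if __name__ == "__main__":
    sensors = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
    refs = np.array([[0.0, 0.0, 0.0], [0.0, 3.0, 4.0]])
    print(pairwise_distances(sensors, refs))
    print(proximity_reward(sensors, refs))
```

Under these assumptions the term is largest when every sensor sits on a reference box and falls toward zero as the fingers move away from the rim, which is one simple way a distance like $d^j_i$ could guide contact placement.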