In-Hand Manipulation of Articulated Tools with Dexterous Robot Hands with Sim-to-Real Transfer

Soofiyan Atar, Daniel Huang, Florian Richter, Michael Yip

TL;DR

This work tackles in-hand manipulation of articulated tools under sim-to-real transfer, where contact-rich dynamics and under-modeled joint phenomena such as friction, stiction, and backlash make policies brittle. The authors propose a three-stage approach: train a privileged oracle policy in simulation, distill it into a proprioceptive base policy, and introduce a Cross-Attention Tactile Force Adaptation (CATFA) module that fuses tactile and motor-force feedback with the policy's intent for online hardware adaptation. CATFA enables fast adaptation to real dynamics, stabilizes contact interactions, and improves disturbance resilience. It is demonstrated on scissors, pliers, surgical tools, and staplers, with robust transfer and generalization to unseen tools. The work highlights the value of proprioceptive transfer complemented by minimal real-world tactile data, offering a scalable pathway for dexterous manipulation in human-centric environments.

Abstract

Reinforcement learning (RL) and sim-to-real transfer have advanced robotic manipulation of rigid objects. Yet, policies remain brittle when applied to articulated mechanisms due to contact-rich dynamics and under-modeled joint phenomena such as friction, stiction, backlash, and clearances. We address this challenge through dexterous in-hand manipulation of articulated tools using a robotic hand with reduced articulation and kinematic redundancy relative to the human hand. Our controller augments a simulation-trained base policy with a sensor-driven refinement learned from hardware demonstrations, conditioning on proprioception and target articulation states while fusing whole-hand tactile and force feedback with the policy's internal action intent via cross-attention-based integration. This design enables online adaptation to instance-specific articulation properties, stabilizes contact interactions, regulates internal forces, and coordinates coupled-link motion under perturbations. We validate our approach across a diversity of real-world examples, including scissors, pliers, minimally invasive surgical tools, and staplers. We achieve robust transfer from simulation to hardware, improved disturbance resilience, and generalization to previously unseen articulated tools, thereby reducing reliance on precise physical modeling in contact-rich settings.
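The cross-attention fusion described above can be illustrated with a minimal NumPy sketch. This is not the authors' CATFA implementation: the dimensions, the random weights, the two-feature token layout, and the residual way the fused feature corrects the base action are all assumptions made for illustration. The sketch only shows the general mechanism, in which a query derived from the policy's action intent attends over tactile and motor-torque tokens.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query, tokens, w_q, w_k, w_v):
    """Single-head cross-attention: one query vector attends over sensor tokens."""
    q = query @ w_q                          # (d,)
    k = tokens @ w_k                         # (n_tokens, d)
    v = tokens @ w_v                         # (n_tokens, d)
    scores = k @ q / np.sqrt(q.shape[-1])    # (n_tokens,)
    weights = softmax(scores)                # attention over sensor tokens
    return weights @ v, weights              # fused feature (d,), weights

rng = np.random.default_rng(0)
n_act, n_tact, d = 6, 5, 16                  # hypothetical sizes, not from the paper

base_action = rng.standard_normal(n_act)     # stand-in for the base policy's intent
f_tact = rng.standard_normal(n_tact)         # stand-in tactile readings
tau_motor = rng.standard_normal(n_act)       # stand-in motor torques

# Each scalar reading becomes a token [value, modality flag] (assumed layout).
tact_tokens = np.stack([f_tact, np.zeros(n_tact)], axis=1)
torque_tokens = np.stack([tau_motor, np.ones(n_act)], axis=1)
tokens = np.concatenate([tact_tokens, torque_tokens], axis=0)   # (11, 2)

# Random projections stand in for learned weights.
w_q = rng.standard_normal((n_act, d)) * 0.1
w_k = rng.standard_normal((2, d)) * 0.1
w_v = rng.standard_normal((2, d)) * 0.1
w_out = rng.standard_normal((d, n_act)) * 0.1

fused, attn = cross_attention(base_action, tokens, w_q, w_k, w_v)
# Residual correction of the base action: an assumption of this sketch.
refined_action = base_action + fused @ w_out
```

Using the intent as the query lets the sensor readings be weighted by their relevance to what the policy is currently trying to do, which is one plausible reading of "fusing feedback with the policy's internal action intent".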

Paper Structure

This paper contains 13 sections, 6 equations, 6 figures, and 3 tables.

Figures (6)

  • Figure 1: Left: open–close mechanics of scissors and pliers. Right: real-world and simulation grasp poses, showing alignment of digital clones with physical interactions.
  • Figure 2: Pipeline for articulated in-hand manipulation. (a) Oracle training with privileged observations in simulation, followed by distillation into a proprioceptive base policy and data collection with additional tactile $f^{\text{tact}}_t$ and motor torque $\tau^{\text{motor}}_t$ observations. (b) CATFA fuses these modalities with policy intent via cross-attention for online adaptation on hardware.
  • Figure 3: Pose norm (top) and quaternion norm (bottom) of the articulated tool (surgical clamp). Blue: policy with perturbation optimization, corresponding to (b). Orange: policy without random-walk perturbations, corresponding to (a).
  • Figure 4: Inspire hand with augmented tactile structure: a 3D-printed pad and foam layer enhance sensitivity, with motor and mimic joints shown. Motor torques $\tau^{\text{motor}}_t$ are computed only from the active joints.
  • Figure 5: Articulated tools used in experiments: real-world examples (top) and simulated counterparts (bottom). The axis denotes the articulation rotation axis. These tools span a wide range of articulation types and gripping strategies, capturing diverse kinematic structures and manipulation provisions.
  • ...and 1 more figure