Learning Dexterous Manipulation Skills from Imperfect Simulations

Elvis Hsieh, Wen-Han Hsieh, Yen-Jen Wang, Toru Lin, Jitendra Malik, Koushil Sreenath, Haozhi Qi

TL;DR

DexScrew tackles the sim-to-real gap in dexterous manipulation by bootstrapping from a simplified simulation to learn rotational finger gaits, then collecting real-world multisensory demonstrations via skill-based teleoperation, and finally training a tactile-aware behavior-cloning policy. The approach yields robust, generalizable manipulation for nut-bolt fastening and screwdriver tasks, outperforming direct sim-to-real transfer and showing strong performance on unseen geometries and under perturbations. Key findings highlight the necessity of tactile feedback and temporal history for stable, efficient manipulation. This staged pipeline offers a practical, scalable path toward dexterous manipulation with general-purpose robotic hands in unstructured environments.

Abstract

Reinforcement learning and sim-to-real transfer have made significant progress in dexterous manipulation. However, progress remains limited by the difficulty of simulating complex contact dynamics and multisensory signals, especially tactile feedback. In this work, we propose DexScrew, a sim-to-real framework that addresses these limitations, and we demonstrate its effectiveness on nut-bolt fastening and screwdriving with multi-fingered hands. The framework has three stages. First, we train reinforcement learning policies in simulation using simplified object models that lead to the emergence of correct finger gaits. We then use the learned policy as a skill primitive within a teleoperation system to collect real-world demonstrations that contain tactile and proprioceptive information. Finally, we train a behavior cloning policy that incorporates tactile sensing and show that it generalizes to nuts and screwdrivers with diverse geometries. Experiments on both tasks show higher task progress ratios than direct sim-to-real transfer, and performance remains robust on unseen object shapes and under external perturbations. Videos and code are available at https://dexscrew.github.io.
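
The third stage trains a behavior cloning policy on demonstrations that carry tactile and proprioceptive signals, and the TL;DR notes that temporal history matters. The sketch below illustrates one way such training pairs could be assembled from a single demonstration. It is not the authors' implementation: the function name `stack_history`, the frame layout, the history length, and the padding rule (repeating the first frame) are all assumptions made for illustration.

```python
from collections import deque
from typing import List, Sequence, Tuple

# Hypothetical frame layout: (proprioception values, tactile values).
Frame = Tuple[Sequence[float], Sequence[float]]


def stack_history(
    frames: Sequence[Frame],
    actions: Sequence[Sequence[float]],
    history: int = 3,
) -> List[Tuple[List[float], List[float]]]:
    """Build (observation, action) pairs for behavior cloning.

    Each observation concatenates the last `history` frames, oldest first.
    Early steps are padded by repeating the first frame (an assumption).
    """
    if history < 1:
        raise ValueError("history must be at least 1")
    if len(frames) != len(actions):
        raise ValueError("frames and actions must have the same length")

    window: deque = deque(maxlen=history)
    samples = []
    for (proprio, tactile), action in zip(frames, actions):
        flat = list(proprio) + list(tactile)
        if not window:
            # First step: fill the whole window with copies of this frame.
            window.extend([flat] * history)
        else:
            # Later steps: the oldest frame is dropped automatically.
            window.append(flat)
        observation = [value for frame in window for value in frame]
        samples.append((observation, list(action)))
    return samples


if __name__ == "__main__":
    demo_frames = [([0.1], [1.0]), ([0.2], [0.0]), ([0.3], [1.0])]
    demo_actions = [[0.5], [0.6], [0.7]]
    for obs, act in stack_history(demo_frames, demo_actions, history=2):
        print(obs, act)
```

Each resulting pair could then be fed to any supervised regressor. A tactile-free baseline would simply omit the tactile values from every frame.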

Paper Structure

This paper contains 16 sections, 3 equations, 6 figures, 7 tables.

Figures (6)

  • Figure 1: An overview of our approach. We first train a reinforcement learning policy in simulation using a simplified object model, which serves as a motion prior for nut-bolt fastening and screwdriving. We then collect real-world trajectories by using the learned policy as a skill primitive during teleoperation. Finally, we train a behavior cloning policy on the collected data to obtain coordinated behavior between the arm and the fingers.
  • Figure 2: Simplified Object Models. Each nut or handle is modeled as a rigid body attached to a fixed base through a revolute joint. This abstraction ignores thread-level mechanics while retaining the essential rotational dynamics needed for learning.
  • Figure 3: Teleoperation Interface. The human operator controls the wrist position using the VR controller buttons and adjusts yaw and pitch through the joystick. This setup allows the operator to guide the arm motion while relying on the learned finger-rotation skill during data collection.
  • Figure 4: Top: The policy with tactile information maintains a consistent alternating pattern of thumb and index finger contact, which supports stable engagement as the nut is rotated downward. Bottom: The policy without tactile information does not maintain a clear contact pattern. This leads to unsuccessful engagement and prevents proper downward wrist motion. The resulting pattern reflects the index finger pressing against the bolt after losing stable contact.
  • Figure 5: Top row: The policy returns to the nut-bolt fastening motion after the fingers are dragged away by an external force. Bottom row: The policy returns to the screwdriving motion after the screwdriver is rotated counterclockwise while the policy is rotating it clockwise.
  • ...and 1 more figure
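
The simplified object model in Figure 2 reduces each nut or handle to a rigid body on a revolute joint, so its state is a single angle. The sketch below shows what such a one-degree-of-freedom abstraction might look like numerically. The damped dynamics, the semi-implicit Euler integrator, and every parameter value (`inertia`, `damping`, `dt`) are illustrative assumptions, not values from the paper or its simulator.

```python
from typing import List, Sequence


def simulate_revolute(
    torques: Sequence[float],
    inertia: float = 1e-4,   # kg*m^2, hypothetical
    damping: float = 1e-3,   # N*m*s/rad, hypothetical
    dt: float = 0.01,        # s, hypothetical
) -> List[float]:
    """Integrate inertia * alpha = torque - damping * omega for one joint.

    Thread-level mechanics are ignored: the body only rotates about a
    fixed axis. Returns the joint angle (rad) after each time step.
    """
    theta, omega = 0.0, 0.0
    angles = []
    for torque in torques:
        alpha = (torque - damping * omega) / inertia
        omega += alpha * dt   # update velocity first (semi-implicit Euler)
        theta += omega * dt   # then update the angle with the new velocity
        angles.append(theta)
    return angles


if __name__ == "__main__":
    # A constant fingertip torque turns the body steadily in one direction.
    trajectory = simulate_revolute([1e-3] * 50)
    print(f"final angle: {trajectory[-1]:.4f} rad")
```

In a reinforcement learning setup, the angle gained per step is the kind of quantity a rotation reward could be built on, although the paper's actual reward is not given in this section.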