Transferring Dexterous Manipulation from GPU Simulation to a Remote Real-World TriFinger

Arthur Allshire, Mayank Mittal, Varun Lodaya, Viktor Makoviychuk, Denys Makoviichuk, Felix Widmaier, Manuel Wüthrich, Stefan Bauer, Ankur Handa, Animesh Garg

TL;DR

This work addresses closed-loop 6-DoF in-hand manipulation with a 3-finger TriFinger by training a single policy in high-speed IsaacGym simulation and transferring it to a remote real robot. It shows that representing the manipulated object with eight 3D keypoints, rather than raw position and quaternion, improves learning and reward shaping for reposing tasks, especially when combined with domain randomization. The resulting policies achieve robust sim-to-real transfer, reaching about 82–83% success on the real apparatus across object morphologies, with a scalable, open-source workflow. Overall, the study demonstrates a practical pathway for end-to-end RL in dexterous manipulation and real-world deployment, enabling reproducibility and extension by other researchers.

Abstract

We present a system for learning a challenging dexterous manipulation task: moving a cube to an arbitrary 6-DoF pose using only three fingers, trained in NVIDIA's IsaacGym simulator. We show empirical benefits, both in simulation and in sim-to-real transfer, of using keypoints rather than a position+quaternion representation of the 6-DoF object pose, both in the policy observations and in the reward calculation used to train a model-free reinforcement learning agent. By combining domain randomization with the keypoint representation of the manipulated object's pose, we achieve a high success rate of 83% on a remote TriFinger system maintained by the organizers of the Real Robot Challenge. To assist further research in learning in-hand manipulation, we make the codebase of our system available, along with trained checkpoints that come with billions of steps of experience, at https://s2r2-ig.github.io
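To make the keypoint idea concrete, here is a minimal sketch of how a position+quaternion pose could be turned into eight 3D keypoints and compared against a goal pose. It is an illustration under stated assumptions, not the authors' implementation: it assumes the keypoints are the corners of the cube, that quaternions are ordered (w, x, y, z), and that the distance is the mean per-keypoint Euclidean distance. The function names and the `side` parameter are hypothetical.

```python
import itertools
import numpy as np


def quat_to_rotmat(q):
    """Rotation matrix from a quaternion, assumed ordered (w, x, y, z)."""
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])


def cube_keypoints(position, quat_wxyz, side=1.0):
    """Eight corner keypoints (shape (8, 3)) of a cube in the world frame.

    Assumption: the keypoints are the cube corners; `side` is a
    hypothetical edge length chosen for illustration.
    """
    # Corners in the object frame, centered on the origin.
    corners = np.array(list(itertools.product((-0.5, 0.5), repeat=3))) * side
    rot = quat_to_rotmat(np.asarray(quat_wxyz, dtype=float))
    # Rotate each corner, then translate by the object position.
    return corners @ rot.T + np.asarray(position, dtype=float)


def keypoint_distance(kp_a, kp_b):
    """Mean Euclidean distance between corresponding keypoints."""
    return np.linalg.norm(kp_a - kp_b, axis=1).mean()


# Same position, but the goal is rotated 90 degrees about the z-axis.
current = cube_keypoints([0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0], side=2.0)
half = np.pi / 4
goal = cube_keypoints([0.0, 0.0, 0.0], [np.cos(half), 0.0, 0.0, np.sin(half)], side=2.0)
error = keypoint_distance(current, goal)
```

In this example the positions match exactly, yet `error` is nonzero because the corresponding corners have moved. A single keypoint distance therefore captures both position and orientation error in the same units, which is one plausible reason such a representation is easier to shape rewards around than separate position and quaternion terms.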

Paper Structure

This paper contains 16 sections, 2 equations, 10 figures, 3 tables.

Figures (10)

  • Figure 1: Top: Our system learns to grasp and manipulate objects to 6-DoF goal poses with a single policy, entirely in simulation, across a variety of objects. Bottom: We then transfer to a real robot located thousands of kilometers away from where development work is done.
  • Figure 2: Previous setups for performing RL-based dexterous manipulation in the real world have relied on specialised hardware or configurations which may be impractical to scale. For example, OpenAI's work on Shadow Hand [openai-sh] started with the cube in hand (avoiding the need to learn to grasp it), relied on phase space tracking, and only set in-hand orientation goals (rather than full pose goals). In contrast, the TriFinger setup relies only on sensor inputs from RGB cameras and encoders in the fingers, and the object starts in a random position on the ground outside of the hand, yet our system achieves 6-DoF reposing on multiple objects across the workspace.
  • Figure 3: Our system trains using the IsaacGym simulator [makoviychuk2021isaac] on 16,384 environments in parallel on a single NVIDIA Tesla V100 or RTX 3090 GPU. Inference is then conducted remotely on a TriFinger robot located across the Atlantic in Germany using the uploaded actor weights. The infrastructure on which we perform sim-to-real transfer is provided courtesy of the organisers of the Real Robot Challenge [real-robot-challenge].
  • Figure 4: The actor and critic networks are parameterized using fully-connected layers with ELU activation functions [clevert2015fast].
  • Figure 5: Training curves on a reward function similar to prior work [trifinger-benchmarking, rrc-submission-chen] for the setting with DR. We take the average of 5 seeds; the shaded areas show standard deviation, noting that curves for Orientation and Position+Orientation overlap during training. It is worth noting that the nature of the reward makes it very difficult for the policy to optimize, particularly achieving an orientation goal.
  • ...and 5 more figures