In-Hand Object Rotation via Rapid Motor Adaptation

Haozhi Qi; Ashish Kumar; Roberto Calandra; Yi Ma; Jitendra Malik

TL;DR

This work tackles generalized in-hand object rotation with a multi-fingered hand by training a base policy in simulation, conditioned on a compact object-extrinsics embedding, and pairing it with a rapid online adaptation module that estimates this embedding from proprioception history. The approach transfers directly from simulation to a real robot hand, rotating dozens of diverse objects using only the fingertips and without real-world fine-tuning, while natural finger gaits emerge during training. Key contributions include the extrinsics-based adaptive policy, an adaptation module trained in simulation, and analyses showing interpretable latent structure and generalization to out-of-distribution objects. The results demonstrate that proprioception-only rapid adaptation is viable for general in-hand manipulation, reducing reliance on vision or tactile sensing.

Abstract

Generalized in-hand manipulation has long been an unsolved challenge in robotics. As a small step towards this grand goal, we demonstrate how to design and learn a simple adaptive controller to achieve in-hand object rotation using only fingertips. The controller is trained entirely in simulation on only cylindrical objects and can then be deployed directly, without any fine-tuning, to a real robot hand to rotate dozens of objects with diverse sizes, shapes, and weights about the z-axis. This is achieved via rapid online adaptation of the controller to the object properties using only proprioception history. Furthermore, natural and stable finger gaits automatically emerge from training the control policy via reinforcement learning. Code and more videos are available at https://haozhi.io/hora
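The online adaptation described in the abstract can be pictured as a control loop in which an adaptation module re-estimates the extrinsics from recent proprioception and actions, and the policy acts on that estimate. The following is a minimal sketch of such a loop, not the authors' implementation: `adaptation_module`, `base_policy`, and `read_joint_positions` are hypothetical placeholders with stand-in arithmetic, and `NUM_JOINTS` and `HISTORY_STEPS` are assumed values. Only the 8-dim extrinsics vector and the three-step observation come from the figure captions of this page.

```python
from collections import deque
import random

NUM_JOINTS = 16      # assumption: joint count of the hand (not stated on this page)
EXTRINSICS_DIM = 8   # the figure captions mention an 8-dim extrinsics vector
OBS_STEPS = 3        # observation: three past joint positions and commanded actions
HISTORY_STEPS = 30   # assumption: history length fed to the adaptation module


def adaptation_module(history):
    """Hypothetical placeholder for phi: history of (q, action) pairs -> z_hat."""
    # Stand-in computation: average the flattened history in EXTRINSICS_DIM chunks.
    flat = [v for q, a in history for v in q + a]
    chunk = len(flat) // EXTRINSICS_DIM
    return [sum(flat[i * chunk:(i + 1) * chunk]) / chunk
            for i in range(EXTRINSICS_DIM)]


def base_policy(obs, z_hat):
    """Hypothetical placeholder for pi: (o_t, z_hat_t) -> joint position targets."""
    bias = sum(z_hat) / len(z_hat)
    last_q = obs[-1][0]  # most recent joint positions in the observation window
    return [q + 0.01 * bias for q in last_q]


def read_joint_positions(rng):
    """Hypothetical sensor read; a real system would query the hand's encoders."""
    return [rng.uniform(-0.1, 0.1) for _ in range(NUM_JOINTS)]


def run_episode(num_steps=50, seed=0):
    rng = random.Random(seed)
    zero = [0.0] * NUM_JOINTS
    # Fixed-length buffer: appending drops the oldest (q, action) pair.
    history = deque([(zero, zero)] * HISTORY_STEPS, maxlen=HISTORY_STEPS)
    actions, action, z_hat = [], zero, None
    for _ in range(num_steps):
        q = read_joint_positions(rng)
        history.append((q, action))
        obs = list(history)[-OBS_STEPS:]       # short window seen by the policy
        z_hat = adaptation_module(history)     # extrinsics re-estimated online
        action = base_policy(obs, z_hat)
        actions.append(action)
    return actions, z_hat
```

Calling `run_episode(num_steps=10)` returns ten 16-dim action vectors and the last 8-dim estimate. The point of the structure is that no vision or tactile input appears anywhere in the loop: everything the policy knows about the object arrives through `z_hat`.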

Paper Structure (18 sections, 3 equations, 10 figures, 5 tables)

Introduction
Related Work
Rapid Motor Adaptation for In-Hand Object Rotation
Base Policy Training
Reward Function.
Object Initialization and Dynamics Randomization.
Adaptation Module Training
Experimental Setup and Implementation Details
Results and Analysis
Generalization via Adaptation
Understanding and Analysis
Real World Qualitative Results
Discussion and Limitations
Additional Results and Analysis
Per-Object Result Analysis.
...and 3 more sections

Figures (10)

Figure 1: Left: Our controller is trained only in simulation on simple cylindrical objects of different sizes and weights. Right: Without any real world fine-tuning, the controller can be deployed to a real robot on a diverse set of objects with different shapes, sizes and weights (object mass and the shortest/longest diameter axis length along the fingertips are shown in the figure) using only proprioceptive information. Emergence of natural stable finger gaits can be observed in the learned control policy; see https://haozhi.io/hora/ for videos.
Figure 2: An overview of our approach at different training and deployment stages. In Base Policy Learning, we jointly optimize $\mu$ and $\pi$ using PPO (Schulman et al., 2017). The observation $o_t$ only contains three past joint positions and commanded actions. Next, in Adaptation Module Learning, we freeze the policy $\pi$ and use supervised learning to train $\phi$ which uses proprioception and action history to estimate the extrinsics vector $\bm{z}_t$. During Deployment, the base policy $\pi$ uses the extrinsics $\hat{\bm z}_t$ estimated and updated online by $\phi$.
Figure 3: Quantitative evaluation on a diverse set of heavy objects (shown on the left). Our method, which uses adaptation, performs best in terms of total rotated angle (in radians), Time To Fall (TTF), and energy efficiency (torque). The DR baseline has a conservative policy, which results in a slower angular velocity. SysID has a more dynamic and agile policy but is very unstable, as can be seen from its lower TTF compared to DR and ours. The NoAdapt baseline fails on the task, showing the importance of continuous online adaptation.
Figure 4: Quantitative evaluation on a diverse set of irregular objects (shown on the left). Our method successfully generalizes to rotating a diverse set of objects, including objects with holes as well as soft and deformable objects (none of these were included in training). Our method outperforms the baselines on all metrics. The DR baseline has the second highest TTF but low Rotations because it outputs very conservative and slow trajectories. SysID yields a slightly faster but very unstable policy. The superior performance of our method over the baselines shows the importance of adaptation via low-dimensional extrinsics estimation for generalization in this task.
Figure 5: Two components of the 8-dim estimated extrinsics vector during one continuous run in which we change the object in the hand every 30s for 6 objects. The top plot shows the extrinsic value $\bm{z}_{t,0}$, which responds to changes in object diameter, with smaller diameters leading to lower values. Since the perceived diameter changes while rotating an irregularly shaped object, we see variations in this extrinsic value even within the same object. The bottom plot shows the correlation between $\bm{z}_{t,2}$ and object mass: it attains higher values for lighter objects and lower values for heavier objects. See https://haozhi.io/hora/#adapt for the video.
...and 5 more figures
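The Adaptation Module Learning stage in the Figure 2 caption is a supervised regression problem, which a toy example can make concrete. The sketch below is illustrative only and rests on several assumptions: the regression target is taken to be the output of a frozen encoder `mu` applied to privileged simulator properties, a common arrangement in this kind of teacher-student setup; `mu` is a fixed linear map here and `phi` a linear model trained by plain stochastic gradient descent; the "history" is a hand-made noisy feature vector; and the extrinsics are 2-dim rather than 8-dim. None of these function names or numeric choices come from the paper.

```python
import random


def mu(e):
    """Frozen stand-in for the trained encoder: privileged (diameter, mass) -> z."""
    diameter, mass = e
    return [0.8 * diameter - 0.1 * mass, -0.5 * mass]


def simulate_history_features(e, rng):
    """Toy proxy for proprioception/action history: noisy features that depend on e."""
    diameter, mass = e
    return [diameter + rng.gauss(0, 0.01), mass + rng.gauss(0, 0.01), 1.0]


def phi(weights, h):
    """Linear adaptation module: history features -> estimated extrinsics."""
    return [sum(w * x for w, x in zip(row, h)) for row in weights]


def mse(weights, data):
    """Mean over samples of the squared error between phi(h) and the target z."""
    total = 0.0
    for h, z in data:
        pred = phi(weights, h)
        total += sum((p - t) ** 2 for p, t in zip(pred, z))
    return total / len(data)


def train_phi(num_samples=200, epochs=200, lr=0.1, seed=0):
    rng = random.Random(seed)
    data = []
    for _ in range(num_samples):
        # Privileged properties are known in simulation but hidden at deployment.
        e = (rng.uniform(0.5, 1.0), rng.uniform(0.1, 1.0))
        data.append((simulate_history_features(e, rng), mu(e)))
    weights = [[0.0, 0.0, 0.0] for _ in range(2)]
    initial_loss = mse(weights, data)
    for _ in range(epochs):
        for h, z in data:
            pred = phi(weights, h)
            for i in range(2):
                err = pred[i] - z[i]
                for j in range(3):
                    weights[i][j] -= lr * err * h[j]  # gradient step on squared error
    return weights, initial_loss, mse(weights, data)
```

`train_phi()` returns the learned weights along with the loss before and after training. In this toy setting `phi` never sees `e` directly, only features that depend on it, which mirrors why the adaptation module can be used on a real robot where privileged properties are unavailable.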
