A System for General In-Hand Object Re-Orientation

Tao Chen, Jie Xu, Pulkit Agrawal

TL;DR

This work tackles in-hand object reorientation with a multi-finger hand facing either upward or downward. It proposes a model-free reinforcement learning framework built on teacher-student learning, a gravity curriculum, and robust object initialization. A privileged teacher policy is trained with PPO on full-state information, then distilled into student policies that operate on reduced-state or RGBD inputs, enabling generalization to thousands of object geometries without explicit object models. In simulation, the approach achieves high success on diverse objects and transfers zero-shot across object datasets; domain randomization and vision-based inputs suggest it is amenable to real-world deployment. The findings indicate that shape information is not strictly necessary for broad reorientation performance, and they identify practical strategies (table support, gravity curriculum, and pose initialization) that improve learning in the challenging downward-facing scenarios.

Abstract

In-hand object reorientation has been a challenging problem in robotics due to the high-dimensional actuation space and the frequent changes in contact state between the fingers and the objects. We present a simple model-free framework that can learn to reorient objects with the hand facing either upward or downward. We demonstrate the capability of reorienting over 2000 geometrically different objects in both cases. The learned policies show strong zero-shot transfer performance on new objects. We provide evidence that these policies are amenable to real-world operation by distilling them to use observations easily available in the real world. Videos of the learned policies are available at: https://taochenshh.github.io/projects/in-hand-reorientation.

Paper Structure

This paper contains 45 sections, 2 equations, 14 figures, 11 tables, 1 algorithm.

Figures (14)

  • Figure 1: We present a simple framework for learning policies for reorienting a large number of objects in scenarios where the (1) hand faces upward, (2) hand faces downward with a table below the hand and (3) without the support of the table. The object orientation in the rightmost image in each row shows the target orientation.
  • Figure 2: Visual policy architecture. MK stands for Minkowski Engine. $q_t$ is the joint positions and $a_t$ is the action at time step $t$.
  • Figure 3: Examples of initial poses of the hand and object. (a): hand faces upward. (b), (c), (d): hand faces downward. (b): both the hand and the object are initialized with random poses. (c): there is a table below the hand. (d): the hand and the object are initialized from the lifted poses.
  • Figure B.1: We learn policies that can reorient many objects in three scenarios respectively: (1) hand faces upward, (2) hand faces downward with a table below the hand, (3) hand faces downward without any table. The extra object in each figure shows the desired orientation.
  • Figure B.2: First row: examples of EGAD objects. Second row: examples of YCB objects.
  • ...and 9 more figures