Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning
Wenlong Huang, Igor Mordatch, Pieter Abbeel, Deepak Pathak
TL;DR
This work tackles generalization in in-hand dexterous manipulation with a geometry-aware multi-task learning framework. A vanilla multi-task RL policy trained across many objects remains competitive with single-task oracles, and it improves substantially when object geometry is supplied through a frozen point-cloud encoder, which enables strong zero-shot generalization to unseen shapes and sizes. A single policy handles over 100 real-world objects, shows a clear scaling effect as more training objects are added and the geometric representation is used, and often outperforms object-specific baselines on held-out objects. The authors release a simulated 114-object benchmark to spur future research and highlight practical design insights, such as freezing the encoder to preserve geometry-sensitive representations. Overall, the work moves toward a general-purpose dexterous manipulation controller that adapts to diverse objects with minimal task-specific tailoring.
Abstract
Dexterous manipulation of arbitrary objects, a fundamental daily task for humans, has been a grand challenge for autonomous robotic systems. Although data-driven approaches using reinforcement learning can develop specialist policies that discover behaviors to control a single object, they often exhibit poor generalization to unseen ones. In this work, we show that policies learned by existing reinforcement learning algorithms can in fact be generalist when combined with multi-task learning and a well-chosen object representation. We show that a single generalist policy can perform in-hand manipulation of over 100 geometrically diverse real-world objects and generalize to new objects with unseen shape or size. Interestingly, we find that multi-task learning with object point cloud representations not only generalizes better but even outperforms the single-object specialist policies on both training and held-out test objects. Video results are available at https://huangwl18.github.io/geometry-dex
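To make the idea of a geometry-aware policy concrete, the following is a minimal sketch of how a frozen point-cloud encoder might feed a single multi-task policy. It is an illustration under stated assumptions, not the authors' implementation: the PointNet-style encoder (shared per-point layers followed by max pooling), the class names `FrozenPointEncoder` and `MultiTaskPolicy`, all layer sizes, and the random weights are hypothetical, and no reinforcement learning update is shown. Only the forward pass is sketched: the object point cloud is mapped to a fixed embedding, which is concatenated with the robot state before the policy computes an action.

```python
import math
import random


def make_linear(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(layer, x):
    """Apply a dense layer to a single input vector."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


class FrozenPointEncoder:
    """Hypothetical PointNet-style encoder: shared per-point MLP + max pool.

    "Frozen" here means the weights are set once in __init__ and nothing in
    this sketch ever updates them.
    """

    def __init__(self, embed_dim=8, seed=0):
        rng = random.Random(seed)
        self.layer1 = make_linear(3, 16, rng)        # input: (x, y, z)
        self.layer2 = make_linear(16, embed_dim, rng)

    def encode(self, points):
        # The same layers are applied to every point independently.
        feats = [linear(self.layer2, relu(linear(self.layer1, p)))
                 for p in points]
        # Channel-wise max over points, so point order does not matter.
        return [max(channel) for channel in zip(*feats)]


class MultiTaskPolicy:
    """Hypothetical policy shared across objects (assumed sizes)."""

    def __init__(self, state_dim, embed_dim, action_dim, seed=1):
        rng = random.Random(seed)
        self.hidden = make_linear(state_dim + embed_dim, 32, rng)
        self.out = make_linear(32, action_dim, rng)

    def act(self, robot_state, object_embedding):
        # Condition the policy on geometry by concatenating the embedding.
        x = list(robot_state) + list(object_embedding)
        return [math.tanh(v)
                for v in linear(self.out, relu(linear(self.hidden, x)))]


encoder = FrozenPointEncoder(embed_dim=8)
policy = MultiTaskPolicy(state_dim=4, embed_dim=8, action_dim=5)

# A random stand-in for an object's point cloud (64 points in 3D).
cloud_rng = random.Random(42)
cloud = [[cloud_rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(64)]

embedding = encoder.encode(cloud)
action = policy.act([0.0, 0.1, -0.2, 0.3], embedding)
print(len(embedding), len(action))
```

In this sketch the same policy weights serve every object, and only the embedding changes from one object to the next, which is one plausible way a shared policy could be conditioned on shape and size. The max pooling step makes the embedding independent of the order in which points are listed.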
