Retrieval Dexterity: Efficient Object Retrieval in Clutters with Dexterous Hand
Fengshuo Bai, Yu Li, Jie Chu, Tawei Chou, Runchuan Zhu, Ying Wen, Yaodong Yang, Yuanpei Chen
TL;DR
Retrieval Dexterity addresses efficient retrieval of objects buried in clutter using a dexterous multi-fingered hand. Policies are trained with reinforcement learning in large-scale simulation across diverse clutter, where they learn emergent strategies (e.g., pushing, stirring) for clearing occluders instead of grasping and removing them one by one. The policies transfer zero-shot to a real robot through a sim-to-real pipeline built on behavior cloning and a transformer-based policy. The approach relies on a pixel-visibility reward, domain randomization, and a task-construction pipeline to generalize to unseen objects and clutter configurations, and real-world experiments show substantially improved retrieval efficiency over baselines. The intended applications are faster, more reliable object retrieval in domestic and industrial settings.
Abstract
Retrieving objects buried beneath multiple objects is not only challenging but also time-consuming. Performing manipulation in such environments presents significant difficulty due to complex contact relationships. Existing methods typically address this task by sequentially grasping and removing each occluding object, resulting in lengthy execution times and requiring impractical grasping capabilities for every occluding object. In this paper, we present a dexterous arm-hand system for efficient object retrieval in multi-object stacked environments. Our approach leverages large-scale parallel reinforcement learning within diverse and carefully designed cluttered environments to train policies. These policies demonstrate emergent manipulation skills (e.g., pushing, stirring, and poking) that efficiently clear occluding objects to expose sufficient surface area of the target object. We conduct extensive evaluations across a set of over 10 household objects in diverse clutter configurations, demonstrating superior retrieval performance and efficiency for both trained and unseen objects. Furthermore, we successfully transfer the learned policies to a real-world dexterous multi-fingered robot system, validating their practical applicability in real-world scenarios. Videos can be found on our project website https://ChangWinde.github.io/RetrDex.
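The goal of exposing "sufficient surface area of the target object" can be illustrated with a small visibility computation. The sketch below is not the authors' implementation: it assumes two binary segmentation masks are available (the target as currently seen, and the target as it would appear with all occluders removed), and the function names, the 0.8 success threshold, and the bonus value are hypothetical choices made only for illustration.

```python
from typing import Sequence

# A binary mask as rows of 0/1 values: 1 marks a pixel belonging to the target.
Mask = Sequence[Sequence[int]]


def count_pixels(mask: Mask) -> int:
    """Count the nonzero pixels in a binary mask."""
    return sum(1 for row in mask for value in row if value)


def visibility_ratio(visible_mask: Mask, full_mask: Mask) -> float:
    """Fraction of the target's unoccluded silhouette that is currently visible.

    visible_mask: target pixels seen in the cluttered scene (assumed input,
                  e.g. from a simulator's segmentation camera).
    full_mask:    target pixels with all occluders removed (assumed input).
    """
    total = count_pixels(full_mask)
    if total == 0:
        return 0.0  # target is not in the camera view at all
    return min(1.0, count_pixels(visible_mask) / total)


def visibility_reward(prev_ratio: float, curr_ratio: float,
                      success_threshold: float = 0.8,  # assumed value
                      success_bonus: float = 1.0) -> float:  # assumed value
    """Assumed reward shape: visibility gain, plus a bonus once enough is exposed."""
    reward = curr_ratio - prev_ratio
    if curr_ratio >= success_threshold:
        reward += success_bonus
    return reward


if __name__ == "__main__":
    full = [[1, 1], [1, 1]]
    seen = [[1, 0], [0, 0]]
    print(visibility_ratio(seen, full))
```

Rewarding the change in visibility rather than its absolute value is one common shaping choice: it gives the policy a signal for each occluder it displaces without paying it for simply holding a good state.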
