DexSinGrasp: Learning a Unified Policy for Dexterous Object Singulation and Grasping in Densely Cluttered Environments

Lixin Xu, Zixuan Liu, Zhewei Gui, Jingxiang Guo, Zeyu Jiang, Tongzhou Zhang, Zhixuan Xu, Chongkai Gao, Lin Shao

TL;DR

DexSinGrasp tackles the challenge of grasping in densely cluttered environments by learning a unified policy that performs object singulation and grasping with a high-DoF dexterous hand. The method combines a single RL objective, a clutter arrangement curriculum, and teacher–student policy distillation to produce a deployable vision-based policy. Key contributions include a unified reward design, dense and random clutter curricula, two teacher policies, and a distillation pipeline enabling zero-shot real-world transfer with a LEAP hand on an xArm6. Experimental results show higher grasping efficiency and success rates than baselines across dense, sparse, irregular, and real-world clutter, highlighting the importance of finger dexterity and curriculum-driven learning for cluttered manipulation.

Abstract

Grasping objects in cluttered environments remains a fundamental yet challenging problem in robotic manipulation. While prior works have explored learning-based synergies between pushing and grasping for two-fingered grippers, few have leveraged the high degrees of freedom (DoF) in dexterous hands to perform efficient singulation for grasping in cluttered settings. In this work, we introduce DexSinGrasp, a unified policy for dexterous object singulation and grasping. DexSinGrasp enables high-dexterity object singulation to facilitate grasping, significantly improving efficiency and effectiveness in cluttered environments. We incorporate clutter arrangement curriculum learning to enhance success rates and generalization across diverse clutter conditions, while policy distillation enables a deployable vision-based grasping strategy. To evaluate our approach, we introduce a set of cluttered grasping tasks with varying object arrangements and occlusion levels. Experimental results show that our method outperforms baselines in both efficiency and grasping success rate, particularly in dense clutter. Code, the appendix, and videos are available on our website: https://nus-lins-lab.github.io/dexsingweb/.

Paper Structure

This paper contains 22 sections, 3 equations, 6 figures, and 7 tables.

Figures (6)

  • Figure 1: We propose DexSinGrasp to learn a unified policy for dexterous object singulation and grasping in densely cluttered environments.
  • Figure 2: Framework of DexSinGrasp. First, we adopt clutter arrangement curriculum learning to progressively improve the teacher policy, which addresses the difficulty of training from scratch in dense or random clutter arrangements, and obtain two teacher policies, one for dense and one for random arrangement tasks. We then collect data with visual observations from these two teachers and train a vision-based student policy via behavior cloning, which better facilitates real-world deployment.
  • Figure 3: Dense and random arrangement settings. We introduce a cluttered environment generation module to create diverse object settings.
  • Figure 4: Qualitative results on object singulation and grasping in simulation.
  • Figure 5: Irregular Clutter. The target object is marked as purple in the center.
  • ...and 1 more figure
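
The clutter arrangement curriculum mentioned in the Figure 2 caption can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ClutterCurriculum`, the stage values, the promotion threshold, and the sliding-window rule are all assumptions made for illustration, since this summary does not state how stages advance. The sketch only shows the general idea of moving to a harder clutter setting once the policy succeeds often enough on the current one.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ClutterCurriculum:
    """Hypothetical curriculum scheduler (illustrative, not from the paper)."""

    stages: list              # assumed difficulty levels, e.g. surrounding-object counts
    promote_at: float = 0.8   # assumed success-rate threshold for promotion
    window: int = 100         # assumed number of recent episodes to average over
    stage_idx: int = 0
    results: deque = field(default_factory=deque)

    @property
    def current(self):
        """Difficulty level the next training episode should use."""
        return self.stages[self.stage_idx]

    def record(self, success: bool) -> None:
        """Log one episode outcome and promote to the next stage if warranted."""
        self.results.append(bool(success))
        if len(self.results) > self.window:
            self.results.popleft()  # keep only the most recent `window` outcomes
        window_full = len(self.results) == self.window
        has_next = self.stage_idx < len(self.stages) - 1
        if window_full and has_next:
            if sum(self.results) / self.window >= self.promote_at:
                self.stage_idx += 1
                self.results.clear()  # start fresh statistics on the harder stage
```

In a training loop, each episode would be generated at `curriculum.current` and its outcome passed to `curriculum.record(...)`. Clearing the window after a promotion keeps successes from the easier stage from counting toward the next one.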