DexTOG: Learning Task-Oriented Dexterous Grasp with Language

Jieyi Zhang; Wenqiang Xu; Zhenjun Yu; Pengfei Xie; Tutian Tang; Cewu Lu

TL;DR

The paper introduces DexTOG, a language-guided diffusion framework for task-oriented grasping (TOG) with dexterous hands. It targets two challenges: optimal grasps are multi-modal, and the configuration space has a high number of degrees of freedom (DoF). The grasp generation model, DexDiffu, produces task-aware grasp poses conditioned on 3D object observations, hand geometry, and natural language task descriptions, followed by a test-time collision refinement. The DexTOG data engine produces the DexTOG-80K dataset by bootstrapping DexDiffu with heuristic filtering and reinforcement learning (RL) validation, covering various tasks on 80 objects from 5 categories and supporting both task-oriented and task-agnostic evaluation. Experiments in simulation show improvements over baselines in both the task-agnostic and task-oriented settings, and ablations highlight the contributions of hand geometry, collision handling, and RL-driven verification. The work contributes a scalable data generation pipeline, a diffusion-based grasp model, and a dataset for dexterous TOG research and manipulation benchmarks.

Abstract

This study introduces DexTOG, a novel language-guided, diffusion-based learning framework for task-oriented grasping (TOG) with dexterous hands. Unlike existing methods, which mainly focus on 2-finger grippers, this research addresses the complexities of dexterous manipulation: the system must identify non-unique optimal grasp poses under specific task constraints, accommodate multiple valid grasps, and search a high degree-of-freedom configuration space during grasp planning. DexTOG comprises a diffusion-based grasp pose generation model, DexDiffu, and a data engine that supports it. Using DexTOG, we also propose a new dataset, DexTOG-80K, built with a Shadow Robot hand performing various tasks on 80 objects from 5 categories, which showcases the dexterity and multi-tasking capabilities of the robotic hand. This research advances dexterous TOG and provides a comprehensive dataset together with simulation validation, setting a new benchmark for robotic manipulation research.
