MoDex: Planning High-Dimensional Dexterous Control via Learning Neural Internal Models
Tong Wu, Shoujie Li, Chuqiao Lyu, Kit-Wa Sou, Wang-Sing Chan, Wenbo Ding
TL;DR
MoDex tackles high-dimensional dexterous control by learning neural internal models of hand dynamics (forward and inverse) and pairing them with a bidirectional planning loop based on the Cross-Entropy Method. The framework supports data-efficient planning, a factorized dynamics variant for in-hand manipulation, and few-shot gesture generation via LLM-generated costs. It is validated on multiple hands in simulation and in real-world deployment, where it shows improved data efficiency when learning new tasks and transfer across tasks.
Abstract
Controlling hands in a high-dimensional action space has been a longstanding challenge, yet humans perform dexterous tasks with ease. In this paper, we draw inspiration from the concept of the internal model exhibited in human behavior and reconsider dexterous hands as learnable systems. Specifically, we introduce MoDex, a framework comprising a pair of neural networks (NNs) that capture the dynamical characteristics of hands and a bidirectional planning approach that is efficient in both training and planning. To show the versatility of MoDex, we further integrate it with an external model to manipulate in-hand objects and with a large language model (LLM) to generate various gestures in both simulation and the real world. Extensive experiments on different dexterous hands demonstrate data efficiency in learning a new task and transferability between tasks.
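
To make the idea of planning over a learned dynamics model concrete, the following is a minimal sketch of Cross-Entropy Method (CEM) planning against a forward model. It is not the MoDex implementation: `forward_model` is a hypothetical linear stand-in for a trained network, the 16-joint hand, the squared-error cost, and all hyperparameters are assumptions chosen for illustration, and the inverse-model half of the bidirectional loop is not reproduced here.

```python
import numpy as np


def forward_model(state, action):
    """Hypothetical stand-in for a learned forward dynamics network.

    Assumption: each joint moves by half of the commanded action.
    A trained neural network would replace this function.
    """
    return state + 0.5 * action


def cem_plan(state, goal, action_dim, n_iters=15, pop_size=200, n_elite=20, seed=0):
    """Search for one action whose predicted next state is close to `goal`."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(action_dim)  # initial sampling distribution: N(0, I)
    std = np.ones(action_dim)
    for _ in range(n_iters):
        # Sample a population of candidate actions, one per row.
        actions = rng.normal(mean, std, size=(pop_size, action_dim))
        # Predict the outcome of every candidate with the forward model.
        next_states = forward_model(state, actions)
        # Cost: squared distance between predicted state and goal.
        costs = np.sum((next_states - goal) ** 2, axis=1)
        # Refit the sampling distribution to the lowest-cost candidates.
        elite = actions[np.argsort(costs)[:n_elite]]
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-6  # small floor avoids a zero std
    return mean


# Toy problem: a 16-joint hand (assumed) starting at zero joint angles.
state = np.zeros(16)
goal = np.linspace(-0.3, 0.3, 16)
action = cem_plan(state, goal, action_dim=16)
error = float(np.linalg.norm(forward_model(state, action) - goal))
```

In this sketch the cost is a fixed distance to a goal pose. Swapping in a different cost function changes the task without retraining the dynamics model, which is the property that makes cost-driven planning attractive for settings such as gesture generation.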
