Sampling-Based Model Predictive Control for Dexterous Manipulation on a Biomimetic Tendon-Driven Hand

Adrian Hess, Alexander M. Kübler, Benedek Forrai, Mehmet Dogar, Robert K. Katzschmann

TL;DR

Dexterous in-hand manipulation with biomimetic tendon-driven hands is hard because of their high dimensionality, complex contact interactions, and uncertain state estimates. The authors combine sampling-based MPC (MuJoCo MPC) with a visual language model (GPT-4o) that autonomously adapts the weights of the task-specific objective from video feedback, so new behaviors can be developed quickly and without retraining. They demonstrate ball rolling, flipping, and catching in simulation and on physical hardware, including scenarios with a robotic arm, and show that a few adaptation cycles suffice to achieve functional dexterity. The work bridges simulation and real-world deployment, offering a flexible framework for rapid development of dexterous manipulation skills on compliant hands.

Abstract

Biomimetic and compliant robotic hands offer the potential for human-like dexterity, but controlling them is challenging due to high dimensionality, complex contact interactions, and uncertainties in state estimation. Sampling-based model predictive control (MPC), using a physics simulator as the dynamics model, is a promising approach for generating contact-rich behavior. However, sampling-based MPC has yet to be evaluated on physical (non-simulated) robotic hands, particularly on compliant hands with state uncertainties. We present the first successful demonstration of in-hand manipulation on a physical biomimetic tendon-driven robot hand using sampling-based MPC. While sampling-based MPC does not require lengthy training cycles like reinforcement learning approaches, it still necessitates adapting the task-specific objective function to ensure robust behavior execution on physical hardware. To adapt the objective function, we integrate a visual language model (VLM) with a real-time optimizer (MuJoCo MPC). We provide the VLM with a high-level human language description of the task and a video of the hand's current behavior. The VLM gradually adapts the objective function, allowing for efficient behavior generation, with each iteration taking less than two minutes. We show the feasibility of ball rolling, flipping, and catching using both simulated and physical robot hands. Our results demonstrate that sampling-based MPC is a promising approach for generating dexterous manipulation skills on biomimetic hands without extensive training cycles.
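
To make the idea of sampling-based MPC with a weighted objective concrete, here is a minimal sketch of a predictive-sampling loop. It is not the authors' implementation and does not use MuJoCo MPC: a damped 1-D point mass stands in for the physics simulator, and the cost terms (`position`, `velocity`, `effort`), their weights, and all function names are assumptions chosen for illustration. The weights dictionary plays the role of the task-specific objective that the abstract says must be adapted.

```python
import random

DT = 0.05  # integration step in seconds (arbitrary choice for this toy)


def step(state, action):
    """Toy damped 1-D point mass; a stand-in for the physics simulator."""
    pos, vel = state
    vel = 0.95 * vel + action * DT
    pos = pos + vel * DT
    return (pos, vel)


def rollout_cost(state, actions, weights, target):
    """Simulate a plan and accumulate the weighted sum of cost terms."""
    total = 0.0
    for action in actions:
        state = step(state, action)
        pos, vel = state
        total += (weights["position"] * (pos - target) ** 2
                  + weights["velocity"] * vel ** 2
                  + weights["effort"] * action ** 2)
    return total


def predictive_sampling(state, nominal, weights, target, rng,
                        num_samples=32, noise_std=0.5):
    """Return the cheapest plan among the nominal plan and noisy copies of it."""
    best_plan = list(nominal)
    best_cost = rollout_cost(state, best_plan, weights, target)
    for _ in range(num_samples):
        candidate = [a + rng.gauss(0.0, noise_std) for a in nominal]
        cost = rollout_cost(state, candidate, weights, target)
        if cost < best_cost:
            best_plan, best_cost = candidate, cost
    return best_plan


def run_mpc(steps=200, horizon=15, seed=0):
    """Receding-horizon loop: replan, apply the first action, shift the plan."""
    rng = random.Random(seed)
    weights = {"position": 10.0, "velocity": 1.0, "effort": 0.01}
    state, target = (0.0, 0.0), 1.0
    plan = [0.0] * horizon
    for _ in range(steps):
        plan = predictive_sampling(state, plan, weights, target, rng)
        state = step(state, plan[0])   # execute only the first planned action
        plan = plan[1:] + [0.0]        # warm-start the next planning step
    return state
```

Because the planner only ever compares rollout costs, changing the behavior means changing the weights rather than retraining anything, which is what makes the objective a natural handle for an outer adaptation loop.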

Paper Structure

This paper contains 20 sections, 4 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Our system accepts a human language description of the task, which is used by a VLM to adapt the objective function of a model predictive controller. We show demonstrations of in-hand ball rolling and ball flipping, both in simulation and on the physical robot hand. Different timestamps are used to display the results in the simulated and real environments.
  • Figure 2: We propose the following pipeline to integrate VLMs with sampling-based MPC to control physical robot hands.
  • Figure 3: We demonstrate ball rolling (top), ball flipping (middle) and flipping with a robot arm in simulation (bottom).
  • Figure 4: We demonstrate successful ball rolling, ball flipping and ball catching on a physical robot hand. Please also see our video: https://youtu.be/6ivbd_jijHA.
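
The VLM-in-the-loop pipeline referred to in Figure 2 can be pictured, very roughly, as an outer loop around the controller: run the behavior, record a video, ask the VLM for revised objective weights, and repeat. The sketch below only illustrates that control flow under stated assumptions. `run_controller` and `query_vlm` are hypothetical callables supplied by the caller (no real VLM or MuJoCo MPC API is used), the weight names are invented, and the filtering of the VLM's proposal is a defensive choice of this sketch, not something the paper describes.

```python
def adapt_weights(task, weights, run_controller, query_vlm, num_cycles=3):
    """Iteratively refine objective weights from (stubbed) video feedback.

    run_controller(weights) -> video   : hypothetical, runs MPC and records it
    query_vlm(task, video, weights)    : hypothetical, returns proposed weights
    """
    history = [dict(weights)]
    for _ in range(num_cycles):
        video = run_controller(weights)
        proposal = query_vlm(task, video, weights)
        # Keep only known cost terms and clamp to non-negative values, so a
        # malformed proposal cannot add or invert objective terms.
        weights = {name: max(0.0, float(proposal.get(name, value)))
                   for name, value in weights.items()}
        history.append(dict(weights))
    return history


# Stand-ins used only to exercise the loop; a real system would run the
# controller on the simulated or physical hand and call an actual VLM here.
def fake_controller(weights):
    return "video recorded with weights %s" % sorted(weights.items())


def fake_vlm(task, video, weights):
    proposal = dict(weights)
    proposal["ball_height"] = weights["ball_height"] * 2.0
    proposal["unknown_term"] = 5.0  # ignored by adapt_weights
    return proposal
```

Calling `adapt_weights("flip the ball", {"ball_height": 1.0, "effort": 0.1}, fake_controller, fake_vlm)` returns the weight history across cycles, which makes it easy to inspect or roll back what the feedback loop changed.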