FAR-Dex: Few-shot Data Augmentation and Adaptive Residual Policy Refinement for Dexterous Manipulation

Yushan Bai, Fulin Chen, Hongzheng Sun, Yuchuang Tong, En Li, Zhengtao Zhang

TL;DR

FAR-Dex is a hierarchical framework that integrates few-shot data augmentation with adaptive residual refinement to achieve robust and precise arm-hand coordination, enabling fine-grained dexterous manipulation with strong positional generalization.

Abstract

Achieving human-like dexterous manipulation through the collaboration of multi-fingered hands with robotic arms remains a longstanding challenge in robotics, primarily due to the scarcity of high-quality demonstrations and the complexity of high-dimensional action spaces. To address these challenges, we propose FAR-Dex, a hierarchical framework that integrates few-shot data augmentation with adaptive residual refinement to enable robust and precise arm-hand coordination in dexterous tasks. First, FAR-DexGen leverages the IsaacLab simulator to generate diverse and physically constrained trajectories from a few demonstrations, providing a data foundation for policy training. Second, FAR-DexRes introduces an adaptive residual module that refines policies by combining multi-step trajectory segments with observation features, thereby enhancing accuracy and robustness in manipulation scenarios. Experiments in both simulation and the real world demonstrate that FAR-Dex improves data quality by 13.4% and task success rates by 7% over state-of-the-art methods. It further achieves over 80% success in real-world tasks, enabling fine-grained dexterous manipulation with strong positional generalization.
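
To make the idea of adaptive residual refinement concrete, the following is a minimal sketch, not the authors' implementation. It assumes the refined action is the base action plus a residual correction scaled element-wise by a weight in (0, 1), where the weight is a sigmoid of a gating value. The function names `sigmoid` and `refine_actions`, the argument `gate_logits`, and the sigmoid gating itself are hypothetical; this summary does not specify how FAR-DexRes computes its weights from trajectory segments and observation features.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def refine_actions(a_base, residual, gate_logits):
    """Blend a base action chunk with a gated residual correction.

    All three arguments are lists of per-step lists with the same shape
    (time steps x action dimensions). The gating form is an assumption
    made for illustration only.
    """
    refined = []
    for base_t, res_t, logit_t in zip(a_base, residual, gate_logits):
        # Per-dimension weight in (0, 1) decides how much residual to apply.
        refined.append(
            [b + sigmoid(g) * r for b, r, g in zip(base_t, res_t, logit_t)]
        )
    return refined
```

With a strongly negative gate value the weight approaches zero and the base action passes through almost unchanged, so under this assumption the refinement can only act where the gate opens.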

Paper Structure

This paper contains 30 sections, 10 equations, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Overview of the proposed FAR-Dex. (a) FAR-DexGen: demonstration trajectories are decomposed and transformed to generate large-scale synthetic data in the simulator. (b) FAR-DexRes: spatio-temporal residual fine-tuning is applied to four dexterous manipulation tasks. (c) Real-world validation: the trained policy is directly deployed in physical environments.
  • Figure 2: FAR-Dex framework pipeline. (a) A few demonstrations $D_h$ are segmented and augmented via spatial transformations in IsaacLab to form a large-scale dataset $D_g$. (b) The combined data are encoded with a pyramid convolutional network, where a consistency model distills the denoising network of $\pi_{\text{base}}$ to accelerate inference. (c) Building on $a_{\text{base}}$ from $\pi_{\text{base}}$, FAR-DexRes integrates multi-step trajectory embedding and observation features to generate adaptive weights $\sigma$ for spatio-temporal residual refinement in dexterous manipulation.
  • Figure 3: Data generation pipeline. Demonstration trajectories are illustrated using a top view, where transformations of the initial object pose $\Delta c_i$ induce corresponding variations across different trajectory segments. These trajectories are then deployed in the simulation environment to collect data.
  • Figure 4: Demonstration Data Collection System in the Real World.
  • Figure 5: Visualization of object interactions in simulation, compared with typical failure cases from existing methods.
  • ...and 3 more figures
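
The Figure 3 caption describes how a change in the initial object pose induces a corresponding change in trajectory segments. The sketch below illustrates one simple way this could work in the top view, under the assumption that an object-centric segment is moved rigidly with the object (a planar rotation by the yaw change, followed by a translation). The function name `transform_segment` and the `(x, y, yaw)` pose format are hypothetical; the paper's actual segmentation and transformation procedure is not given in this summary.

```python
import math


def transform_segment(waypoints, old_pose, new_pose):
    """Re-target an object-centric segment from old_pose to new_pose.

    Poses are (x, y, yaw) tuples in the top view and waypoints are (x, y)
    tuples. This is an illustrative planar rigid transform only.
    """
    ox, oy, oyaw = old_pose
    nx, ny, nyaw = new_pose
    dyaw = nyaw - oyaw
    c, s = math.cos(dyaw), math.sin(dyaw)
    moved = []
    for x, y in waypoints:
        # Express the waypoint relative to the old object position.
        rx, ry = x - ox, y - oy
        # Rotate by the yaw change, then shift to the new object position.
        moved.append((nx + c * rx - s * ry, ny + s * rx + c * ry))
    return moved
```

For example, calling `transform_segment` with several sampled values of `new_pose` turns one demonstrated segment into several spatially varied ones, which would then still need to be replayed in the simulator to check that they remain physically valid.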