MILE: A Mechanically Isomorphic Exoskeleton Data Collection System with Fingertip Visuotactile Sensing for Dexterous Manipulation

Jinda Du, Jieji Ren, Qiaojun Yu, Ningbin Zhang, Yu Deng, Xingyu Wei, Yufei Liu, Guoying Gu, Xiangyang Zhu

TL;DR

This paper introduces MILE, a mechanically isomorphic exoskeleton–robot hand system with fingertip visuotactile sensing designed to collect high-fidelity demonstrations for dexterous manipulation. By ensuring one-to-one joint correspondence and embedding compact Tac-Tip sensors, the system eliminates retargeting distortions and achieves sub-degree joint sensing, enabling precise teleoperation and rich multimodal data capture. The authors demonstrate substantial gains in teleoperation success and improved robustness of imitation-learning policies when tactile data are included, validated on several contact-rich tasks. Overall, MILE provides a scalable data-collection pipeline that advances learning-based dexterous manipulation through high-quality vision–tactile multimodal demonstrations.

Abstract

Imitation learning provides a promising approach to dexterous hand manipulation, but its effectiveness is limited by the lack of large-scale, high-fidelity data. Existing data-collection pipelines suffer from inaccurate motion retargeting, low data-collection efficiency, and missing high-resolution fingertip tactile sensing. We address this gap with MILE, a mechanically isomorphic teleoperation and data-collection system co-designed from human hand to exoskeleton to robotic hand. The exoskeleton is anthropometrically derived from the human hand, and the robotic hand preserves one-to-one joint-position isomorphism, eliminating nonlinear retargeting and enabling precise, natural control. The exoskeleton achieves a multi-joint mean absolute angular error below one degree, while the robotic hand integrates compact fingertip visuotactile modules that provide high-resolution tactile observations. Built on this retargeting-free interface, we teleoperate complex, contact-rich in-hand manipulation and efficiently collect a multimodal dataset comprising high-resolution fingertip visuotactile signals, RGB-D images, and joint positions. The teleoperation pipeline achieves a mean success rate improvement of 64%. Incorporating fingertip tactile observations further increases the success rate by an average of 25% over the vision-only baseline, validating the fidelity and utility of the dataset. Further details are available at: https://sites.google.com/view/mile-system.
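
To make the "multi-joint mean absolute angular error" metric concrete, the following is a minimal sketch of how such a figure could be computed from encoder readings and reference angles (for example, from motion capture). The function names, the data layout (one list of joint angles per frame, in degrees), and the sample values are illustrative assumptions, not the authors' evaluation code or data.

```python
from statistics import mean


def wrap_deg(delta):
    """Wrap an angular difference into the range [-180, 180) degrees."""
    return (delta + 180.0) % 360.0 - 180.0


def multi_joint_mae(measured, reference):
    """Mean absolute angular error in degrees, pooled over frames and joints.

    measured, reference: lists of frames; each frame is a list of joint
    angles in degrees (hypothetical layout chosen for this sketch).
    """
    if len(measured) != len(reference):
        raise ValueError("frame counts differ")
    errors = []
    for m_frame, r_frame in zip(measured, reference):
        if len(m_frame) != len(r_frame):
            raise ValueError("joint counts differ within a frame")
        errors.extend(abs(wrap_deg(m - r)) for m, r in zip(m_frame, r_frame))
    return mean(errors)


# Made-up readings for three joints over two frames (not data from the paper).
measured = [[10.5, 20.0, 29.0], [11.0, 19.5, 31.0]]
reference = [[10.0, 20.0, 30.0], [10.0, 20.0, 30.0]]
mae = multi_joint_mae(measured, reference)
print(f"multi-joint MAE: {mae:.3f} deg")
```

Pooling all joints and frames into one average is only one possible convention; a per-joint average followed by a mean across joints would give the same result here because every joint has the same number of samples.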

Paper Structure

This paper contains 26 sections, 6 equations, 12 figures, 2 tables.

Figures (12)

  • Figure 1: Overview of the MILE data collection system. The system integrates fingertip visuotactile sensing with a mechanically isomorphic MILE exoskeleton to collect dexterous hand demonstrations. It achieves sub-degree joint accuracy, enabling complex, contact-rich in-hand manipulation. The modular, low-cost tactile sensor is compact, integrates into the system, and provides high-resolution contact measurements. Policies trained on the collected data with visuotactile inputs outperform vision-only baselines on contact-rich manipulation tasks, indicating improved robustness and inference quality.
  • Figure 2: Size relationship among the human hand, the MILE exoskeleton, and the MILE-Tac hand: the human hand is close in scale to the exoskeleton, whereas the exoskeleton and the dexterous hand are kinematically isomorphic, with a scale ratio of 5:9.
  • Figure 3: Exploded views of the assembly and key components. (a) Overall view of the MILE exoskeleton: 5-DoF thumb and 4-DoF index, middle, and ring fingers. (b) Detail of the fingertip joint. (c) Overall view of the MILE-Tac hand with a Tac-Tip on each finger. (d) Exploded view of the Tac-Tip visuotactile sensor.
  • Figure 4: MoCap setup and marker layouts. (a) Camera arrangement with Manus glove markers. (b) Single-joint precision test. (c) MILE exoskeleton. (d) 5DT glove. (e) Teleoperation precision test.
  • Figure 5: Single-encoder precision test with MI (Supplementary Video 1).
  • ...and 7 more figures