3D-ViTac: Learning Fine-Grained Manipulation with Visuo-Tactile Sensing

Binghao Huang; Yixuan Wang; Xinyi Yang; Yiyue Luo; Yunzhu Li

TL;DR

3D-ViTac addresses the need for integrated visuo-tactile sensing in dexterous robotic manipulation. It mounts dense piezoresistive tactile sensors on soft grippers and fuses tactile and visual data into a unified 3D representation that preserves the spatial relations between the two modalities; a diffusion-based policy is then trained on this representation by imitation learning. On real, low-cost hardware, the resulting policies outperform vision-only baselines, particularly under visual occlusion and in long-horizon tasks involving in-hand manipulation. The work combines scalable tactile sensing with explicit 3D fusion, offering a practical path toward robust, contact-rich manipulation.

Abstract

Tactile and visual perception are both crucial for humans to perform fine-grained interactions with their environment. Developing similar multi-modal sensing capabilities for robots can significantly enhance and expand their manipulation skills. This paper introduces 3D-ViTac, a multi-modal sensing and learning system designed for dexterous bimanual manipulation. Our system features tactile sensors equipped with dense sensing units, each covering an area of 3 mm². These sensors are low-cost and flexible, providing detailed and extensive coverage of physical contacts, effectively complementing visual information. To integrate tactile and visual data, we fuse them into a unified 3D representation space that preserves their 3D structures and spatial relationships. The multi-modal representation can then be coupled with diffusion policies for imitation learning. Through concrete hardware experiments, we demonstrate that even low-cost robots can perform precise manipulations and significantly outperform vision-only policies, particularly in safe interactions with fragile items and executing long-horizon tasks involving in-hand manipulation. Our project page is available at https://binghao-huang.github.io/3D-ViTac/.
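To make the idea of a unified 3D visuo-tactile representation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a flat tactile pad whose taxels lie on a regular grid, a known 4x4 pose of the pad in the world frame (for example from forward kinematics), and a camera point cloud already expressed in that same frame. The function names, the `pitch_m` taxel spacing, and the five-channel point layout `[x, y, z, tactile_value, is_tactile]` are all hypothetical choices made for illustration.

```python
import numpy as np


def taxel_points_in_world(pad_pose_world, rows, cols, pitch_m):
    """Return (rows*cols, 3) world-frame positions of taxel centres.

    Assumption: the pad is flat, taxels sit on the pad's local x-y plane,
    and the grid is centred on the pad origin.
    """
    xs = (np.arange(cols) - (cols - 1) / 2.0) * pitch_m
    ys = (np.arange(rows) - (rows - 1) / 2.0) * pitch_m
    gx, gy = np.meshgrid(xs, ys)  # both have shape (rows, cols)
    n = rows * cols
    # Homogeneous coordinates so one matrix product applies the pose.
    local = np.stack([gx.ravel(), gy.ravel(), np.zeros(n), np.ones(n)], axis=1)
    return (pad_pose_world @ local.T).T[:, :3]


def fuse_visuo_tactile(visual_xyz, tactile_grid, pad_pose_world, pitch_m):
    """Merge camera points and tactile readings into one point set.

    Each output row is [x, y, z, tactile_value, is_tactile] (hypothetical layout).
    """
    rows, cols = tactile_grid.shape
    tactile_xyz = taxel_points_in_world(pad_pose_world, rows, cols, pitch_m)
    # Camera points carry no tactile signal and a modality flag of 0.
    visual = np.hstack([visual_xyz, np.zeros((len(visual_xyz), 2))])
    # Tactile points carry their reading and a modality flag of 1.
    tactile = np.hstack([
        tactile_xyz,
        tactile_grid.reshape(-1, 1),
        np.ones((rows * cols, 1)),
    ])
    return np.vstack([visual, tactile])


if __name__ == "__main__":
    cloud = np.random.rand(100, 3)       # stand-in camera point cloud
    readings = np.random.rand(4, 4)      # stand-in tactile frame
    pose = np.eye(4)
    pose[:3, 3] = [0.3, 0.0, 0.1]        # illustrative pad position
    fused = fuse_visuo_tactile(cloud, readings, pose, pitch_m=0.003)
    print(fused.shape)
```

Because both modalities end up as points in one coordinate frame, the spatial relation between a contact and the surrounding geometry is explicit in the data, and the fused array could be passed to any point-cloud encoder that feeds a downstream policy.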
