ViTacTip: Design and Verification of a Novel Biomimetic Physical Vision-Tactile Fusion Sensor

Wen Fan; Haoran Li; Weiyong Si; Shan Luo; Nathan Lepora; Dandan Zhang

TL;DR

ViTacTip is a biomimetic vision-tactile fusion sensor whose see-through skin enables concurrent visual and tactile data capture in a compact form. A GAN-based modality-switching framework disentangles visual and tactile information, allowing ViTacTip to emulate TacTip or ViTac behavior as needed. Across grating identification, edge pose regression, and contact localization with force estimation, ViTacTip outperforms the single-modality baselines, achieving 99.72% grating accuracy and improved localization and force predictions under varying lighting. These results indicate improved robustness and practical potential for integrated perception in dexterous manipulation tasks.

Abstract

Tactile sensing is important for robotics because it provides physical contact information during manipulation. To capture multimodal contact information within a compact framework, we designed a novel sensor called ViTacTip, which integrates tactile and visual perception in a single sensor unit. ViTacTip features a transparent skin that captures fine features of objects during contact, a design we refer to as the see-through-skin mechanism. In addition, the biomimetic tips embedded in ViTacTip amplify touch motions during tactile perception. For comparative analysis, we also fabricated a ViTac sensor without biomimetic tips and a TacTip sensor with opaque skin. Furthermore, we developed a Generative Adversarial Network (GAN)-based approach for modality switching, which shifts the emphasis between the vision and tactile perception modes. We evaluated the proposed sensor on three tasks: i) grating identification, ii) pose regression, and iii) contact localization and force estimation. In the grating identification task, ViTacTip achieved an accuracy of 99.72%, surpassing TacTip at 94.60%. It also performed better in the pose and force estimation tasks, with minimum errors of 0.08 mm and 0.03 N, respectively, compared with ViTac's 0.12 mm and 0.15 N. These results indicate that ViTacTip outperforms single-modality sensors.
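
To put the reported numbers side by side, the short sketch below computes the gaps implied by the abstract. It uses only the figures quoted above (99.72% vs. 94.60% accuracy, 0.08 mm vs. 0.12 mm, 0.03 N vs. 0.15 N); the helper name `relative_reduction` and the variable names are illustrative assumptions, not part of the paper, and the comparison is of the minimum errors as quoted rather than of full error distributions.

```python
# Figures quoted in the abstract; all names here are illustrative only.
def relative_reduction(baseline: float, ours: float) -> float:
    """Fractional reduction of `ours` relative to `baseline`."""
    return (baseline - ours) / baseline

# Grating identification: ViTacTip vs. TacTip, in percentage points.
accuracy_gain_pp = 99.72 - 94.60

# Minimum pose error (mm) and force error (N): ViTacTip vs. ViTac.
pose_error_cut = relative_reduction(0.12, 0.08)
force_error_cut = relative_reduction(0.15, 0.03)

print(f"grating accuracy gain: {accuracy_gain_pp:.2f} percentage points")
print(f"pose error reduction:  {pose_error_cut:.0%}")
print(f"force error reduction: {force_error_cut:.0%}")
```

On these figures the accuracy gap is about 5 percentage points, while the minimum pose and force errors drop by roughly one third and four fifths, respectively.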

Paper Structure (12 sections, 6 figures)

Introduction
Related Work
Design and Fabrication
ViTacTip Design Principles
ViTacTip Fabrication
Vision-Tactile Fusion Imaging Principle
Modality Conversion Principle
Experiment and Results
Grating Identification
Pose Regression
Contact Localization and Force Estimation
Conclusions and Future Work

Figures (6)

Figure 1: A: ViTacTip schematic, which achieves an internal fusion of the vision and tactile modalities through bio-inspired pins and a transparent skin. B: Architecture of the ViTacTip sensor, shown as an exploded view of its sub-parts.
Figure 2: Modality conversion framework. Two independent GAN models are trained with three datasets from ViTac, ViTacTip, and TacTip separately to achieve marker-removing tasks and light-removing tasks.
Figure 3: Overview of three experiments designed for ViTacTip, ViTac, and TacTip.
Figure 4: Evaluation results of TacTip, ViTac, and ViTacTip on grating identification.
Figure 5: Evaluation results of TacTip, ViTac, and ViTacTip on edge pose regression for horizontal distance $X$, press depth $Z$, and rotation angle $\theta$. The red line indicates the mean fit of the regression values; the smaller its deviation from the diagonal $y=x$, the better the prediction.
...and 1 more figure
