Sim2Real Bilevel Adaptation for Object Surface Classification using Vision-Based Tactile Sensors

Gabriele M. Caddeo; Andrea Maracani; Paolo D. Alfano; Nicola A. Piga; Lorenzo Rosasco; Lorenzo Natale

TL;DR

This work tackles the Sim2Real gap in vision-based tactile surface classification with adaptation at two levels. At the image level, a diffusion model trained on a small, unlabeled dataset of real DIGIT images translates simulated images into the real domain. At the feature level, a Domain-Adversarial Training of Neural Networks (DANN) framework aligns the features of the two domains. Surfaces are labeled automatically from object meshes into four categories (flat, curve, edge, corner) via a curvature-based metric, so the classifier trains on labeled simulated data without manual annotation. The resulting classifier reaches 81.9% overall accuracy on real tactile data, compared with 34.7% when trained on simulated data alone, and also improves 6D object pose estimation from tactile cues. The approach needs only a small real dataset and no real labels, and it carries over to downstream tactile perception tasks.

Abstract

In this paper, we address the Sim2Real gap in the field of vision-based tactile sensors for classifying object surfaces. We train a Diffusion Model to bridge this gap using a relatively small dataset of real-world images randomly collected from unlabeled everyday objects via the DIGIT sensor. Subsequently, we employ a simulator to generate images by uniformly sampling the surface of objects from the YCB Model Set. These simulated images are then translated into the real domain using the Diffusion Model and automatically labeled to train a classifier. During this training, we further align features of the two domains using an adversarial procedure. Our evaluation is conducted on a dataset of tactile images obtained from a set of ten 3D printed YCB objects. The results reveal a total accuracy of 81.9%, a significant improvement compared to the 34.7% achieved by the classifier trained solely on simulated images. This demonstrates the effectiveness of our approach. We further validate our approach using the classifier on a 6D object pose estimation task from tactile data.
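The adversarial feature alignment mentioned above is commonly implemented with a gradient reversal layer, as in the original DANN formulation. The sketch below illustrates that general idea in plain Python on scalar gradients. It is not the authors' implementation: the function names are hypothetical, and the lambda schedule with `gamma = 10` comes from the original DANN paper, not from this work.

```python
import math


def grl_lambda(progress: float, gamma: float = 10.0) -> float:
    """Reversal strength for training progress in [0, 1].

    Schedule from the original DANN paper; assumed here, not confirmed
    for this work. It ramps smoothly from 0 towards 1.
    """
    return 2.0 / (1.0 + math.exp(-gamma * progress)) - 1.0


def reverse_gradient(grad: float, lam: float) -> float:
    """Backward pass of a gradient reversal layer.

    The forward pass is the identity; the backward pass flips the sign
    of the incoming gradient and scales it by lam.
    """
    return -lam * grad


def feature_gradient(grad_label: float, grad_domain: float, lam: float) -> float:
    """Gradient that reaches the shared feature extractor.

    The label-classifier gradient passes through unchanged, while the
    domain-classifier gradient is reversed, so the features are pushed
    to be discriminative for labels but uninformative about the domain.
    """
    return grad_label + reverse_gradient(grad_domain, lam)


if __name__ == "__main__":
    for progress in (0.0, 0.5, 1.0):
        lam = grl_lambda(progress)
        print(progress, lam, feature_gradient(0.5, 0.2, lam))
```

In a deep-learning framework the same effect is obtained by a custom autograd operation placed between the feature extractor and the domain classifier, so a single backward pass trains both heads.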

Paper Structure

This paper contains 12 sections, 6 equations, 7 figures, 3 tables.

INTRODUCTION
RELATED WORK
METHOD
Acquisition and labeling of simulated data
Image-level adaptation
Feature-level adaptation
Training and testing datasets
EXPERIMENTAL RESULTS
Experiments on surface classification
Experiments on 6D object pose estimation
Limitations
Conclusion

Figures (7)

Figure 1: Our pipeline uses a Diffusion Model to translate simulated images towards the real domain so as to reduce the Sim2Real gap.
Figure 2: Overview of our pipeline for object surface classification.
Figure 3: On the left: curvature levels (Eq. \ref{eq:curvature}) for the YCB objects "mustard bottle" and "sugar box". On the right: the resulting labels of surface types.
Figure 4: Diagram of the classification architecture.
Figure 5: On the left: the testing setup with the DIGIT sensor touching the "bleach cleanser" object. On the right: example of YCB objects and their 3D printed counterparts.
...and 2 more figures
