Cross-Sensor Touch Generation
Samanta Rodriguez, Yiming Dou, Miquel Oller, Andrew Owens, Nima Fazeli
TL;DR
Visuo-tactile sensors differ widely in design, so a model trained on one sensor rarely works on another. This work proposes cross-sensor tactile generation, which translates tactile images between sensors so that sensor-specific models can be reused. It introduces two pipelines: Touch2Touch (T2T), a one-stage diffusion approach that requires paired data, and Touch-to-Depth-to-Touch (T2D2), a two-stage approach that passes through an intermediate depth representation and works with unpaired data. The authors validate both methods on in-hand pose estimation and behavior cloning, demonstrating transfer across Soft Bubble, GelSlim, and DIGIT sensors. T2T delivers higher fidelity, while T2D2 needs less data to add a new sensor. The results expose a trade-off between fidelity and data flexibility and point toward sensor-interoperable tactile systems in which downstream perception and control pipelines can be reused.
Abstract
Today's visuo-tactile sensors come in many shapes and sizes, making it challenging to develop general-purpose tactile representations. This is because most models are tied to a specific sensor design. To address this challenge, we propose two approaches to cross-sensor image generation. The first is an end-to-end method that leverages paired data (Touch2Touch). The second method builds an intermediate depth representation and does not require paired data (T2D2: Touch-to-Depth-to-Touch). Both methods enable the use of sensor-specific models across multiple sensors via the cross-sensor touch generation process. Together, these models offer flexible solutions for sensor translation, depending on data availability and application needs. We demonstrate their effectiveness on downstream tasks such as in-hand pose estimation and behavior cloning, successfully transferring models trained on one sensor to another. Project page: https://samantabelen.github.io/cross_sensor_touch_generation.
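The structural difference between the two pipelines can be sketched in a few lines of Python. This is only an illustration of how a one-stage translator and a depth-mediated two-stage translator plug into a downstream model; every name here (`Translator`, `compose_t2d2`, `run_downstream`, and the toy stand-in functions) is hypothetical and not taken from the authors' code, and the stand-ins are trivial arithmetic rather than learned generative models.

```python
from typing import Callable, List

# Placeholder image type: a 2D grid of floats stands in for a tactile or depth image.
Image = List[List[float]]
Translator = Callable[[Image], Image]


def compose_t2d2(touch_to_depth: Translator, depth_to_touch: Translator) -> Translator:
    """Build a two-stage translator: source touch -> depth -> target touch."""
    def translate(source_touch: Image) -> Image:
        depth = touch_to_depth(source_touch)   # stage 1: estimate an intermediate depth map
        return depth_to_touch(depth)           # stage 2: render the target sensor's image
    return translate


def run_downstream(translate: Translator,
                   downstream_model: Callable[[Image], float],
                   source_touch: Image) -> float:
    """Feed a translated image to a model that was trained on the target sensor."""
    return downstream_model(translate(source_touch))


# Toy stand-ins (assumptions for illustration only, not learned models).
def toy_touch_to_depth(img: Image) -> Image:
    return [[0.5 * v for v in row] for row in img]


def toy_depth_to_touch(img: Image) -> Image:
    return [[v + 1.0 for v in row] for row in img]


def toy_downstream_model(img: Image) -> float:
    # Stand-in for a sensor-specific model such as a pose estimator.
    return sum(sum(row) for row in img)


t2d2 = compose_t2d2(toy_touch_to_depth, toy_depth_to_touch)
translated = t2d2([[1.0, 2.0]])
score = run_downstream(t2d2, toy_downstream_model, [[1.0, 2.0]])
```

In this framing, a one-stage method such as Touch2Touch would be passed to `run_downstream` directly as a single `Translator`, whereas the two-stage method is assembled from two separately trained pieces that meet at the depth representation.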
