Touch-to-Touch Translation -- Learning the Mapping Between Heterogeneous Tactile Sensing Technologies

Francesco Grella, Alessandro Albini, Giorgio Cannata, Perla Maiolino

TL;DR

This paper frames the problem of learning the mapping between two tactile sensor outputs for the same physical stimulus as touch-to-touch translation, proposes two data-driven approaches to address it, and compares their performance.

Abstract

The use of data-driven techniques for tactile data processing and classification has recently increased. However, collecting tactile data is a time-consuming and sensor-specific procedure. Indeed, due to the lack of hardware standards in tactile sensing, data must be collected separately for each sensor. This paper considers the problem of learning the mapping between two tactile sensor outputs with respect to the same physical stimulus -- we refer to this problem as touch-to-touch translation. We propose two data-driven approaches to address this task and compare their performance. The first exploits a generative model developed for image-to-image translation and adapted to this context. The second uses a ResNet model trained to perform a regression task. We validated both methods using two completely different tactile sensors -- the camera-based Digit and the capacitance-based CySkin. In particular, we used Digit images to generate the corresponding CySkin data. We trained the models on a set of tactile features that can be found in common larger objects, and we tested them on a previously unseen set of data. Experimental results show that Digit images can be translated into the CySkin output while preserving the contact shape, with an error of 15.18% in the magnitude of the sensor responses.

Paper Structure

This paper contains 12 sections, 7 figures, 1 table.

Figures (7)

  • Figure 1: touch2touch architecture and training pipeline. (1) The network is trained by collecting paired tactile samples corresponding to the same physical stimulus; (2) Camera-based sensor and its output; (3) Taxel-based tactile sensor. The red circle highlights a single taxel whose measurement is contained in the output array $Y$; (4) The output $Y$ is converted into an image $I$ and input to the pix2pix model; (5) The pix2pix model and its two loss functions as described in pix2pix. G and D represent the generator and the discriminator; (6) The $\mathcal{L}_3$ loss added to properly stop the model training; (7) The output of touch2touch, $\hat{Y}$, corresponding to $\hat{I}$ converted into an array.
  • Figure 2: The regression approach based on the ResNet18 model. The softmax layer has been replaced with a fully connected layer of size $N$, corresponding to the number of taxels. The system is trained using the $\mathcal{L}_3$ loss. The shortcut connections among the ResNet18 layers are not shown for simplicity.
  • Figure 3: Tactile primitives used in this paper. The picture shows one for each of the 8 types. We 3D printed each of them at 4 different scales, leading to a total of 32 primitives. (a) Line with smooth edges. (b) Square. (c) Empty circle. (d) Circle. (e) Bump. (f) Empty square. (g) Hemisphere. (h) Line with sharp edges.
  • Figure 4: Objects used as a test set. The keypoints, corresponding to the sampling positions, are marked with light blue squares on the objects. (a) Pliers. (b) Clamp. (c) Scissors. (d) Allen key. (e) Wrench.
  • Figure 5: The two tactile sensors used in this paper are integrated on a plastic mount that can be connected to the Panda Robot flange. (a) Digit provides an output RGB image of $320 \times 240$ pixels. The white soft layer on top is made of Solaris with Shore 15. (b) CySkin is a capacitance-based tactile sensor. The soft layer covering the sensor is made of Ecoflex Shore 10. In this picture, the soft layer has been removed to show the distribution of the tactile elements (the green circles). The pitch among them is 7.5 mm. The sensor output corresponds to an array of 20 measurements.
  • ...and 2 more figures
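
Steps (4) and (7) of the Figure 1 caption describe converting the taxel array $Y$ into an image $I$ and converting the generated image $\hat{I}$ back into an array $\hat{Y}$. The following is a minimal sketch of such a round trip, not the authors' implementation. Only the array size of 20 comes from the Figure 5 caption. The regular 4 x 5 grid, the `FULL_SCALE` normalization constant, the 8-bit quantization, and the function names `array_to_image` and `image_to_array` are all hypothetical choices made for illustration; the captions do not say how the paper places taxels in the image or scales their values.

```python
from typing import List

N_TAXELS = 20        # array size stated in the Figure 5 caption
ROWS, COLS = 4, 5    # hypothetical grid; the real taxel layout is not specified here
FULL_SCALE = 1000.0  # hypothetical maximum taxel reading, used only for normalization


def array_to_image(y: List[float]) -> List[List[int]]:
    """Map N_TAXELS readings to a ROWS x COLS 8-bit grayscale image (one pixel per taxel)."""
    if len(y) != N_TAXELS:
        raise ValueError(f"expected {N_TAXELS} readings, got {len(y)}")
    pixels = []
    for value in y:
        # Clip to the assumed sensor range, then quantize to 0..255.
        clipped = min(max(value, 0.0), FULL_SCALE)
        pixels.append(round(255 * clipped / FULL_SCALE))
    # Split the flat pixel list into rows in row-major order.
    return [pixels[r * COLS:(r + 1) * COLS] for r in range(ROWS)]


def image_to_array(image: List[List[int]]) -> List[float]:
    """Invert array_to_image: flatten row-major and rescale to sensor units."""
    flat = [p for row in image for p in row]
    if len(flat) != N_TAXELS:
        raise ValueError(f"expected {N_TAXELS} pixels, got {len(flat)}")
    return [FULL_SCALE * p / 255 for p in flat]


if __name__ == "__main__":
    y = [50.0 * i for i in range(N_TAXELS)]  # synthetic readings, not real sensor data
    y_hat = image_to_array(array_to_image(y))
    worst = max(abs(a - b) for a, b in zip(y, y_hat))
    print(f"worst round-trip error: {worst:.3f}")
```

Under these assumptions the conversion is lossy only through quantization: for readings inside the assumed range, the round-trip error per taxel is at most half a gray level, that is `FULL_SCALE / 510` in sensor units.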