TransForce: Transferable Force Prediction for Vision-based Tactile Sensors with Sequential Image Translation

Zhuo Chen, Ni Ou, Xuyang Zhang, Shan Luo

TL;DR

This work tackles the problem of transferring force prediction across vision-based tactile sensors (VBTSs) despite domain gaps in illumination and marker patterns. It introduces TransForce, a two-stage framework that first translates tactile images from a source sensor to a target sensor style using a CycleGAN-like translator, then trains a recurrent force predictor on the generated sequences to estimate forces for unseen sensors. The approach leverages sequential visual cues to better capture elastomer deformation and achieves high accuracy in both normal and shear directions, with marker-based modalities excelling in shear and RGB information aiding normal-force estimation. By enabling reuse of existing image-force data across sensors, TransForce offers a practical pathway to fast, low-cost force calibration and scalable tactile sensing for versatile robot manipulation.
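
The data flow of the first stage can be sketched as follows. This is a minimal illustration, not the authors' implementation: `build_training_set`, `toy_translate`, and the flattened-list image type are hypothetical names and stand-ins, and the trained source-to-target generator is replaced by a trivial placeholder. The point it shows is that each translated frame keeps the force label of the source frame it came from, so existing labels can be reused for the new sensor.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[float]                 # stand-in for a tactile image (flattened pixels)
Force = Tuple[float, float, float]  # (Fx, Fy, Fz) in newtons


def build_training_set(
    source_sequences: Sequence[Sequence[Image]],
    source_forces: Sequence[Sequence[Force]],
    translate: Callable[[Image], Image],
) -> List[Tuple[List[Image], List[Force]]]:
    """Translate every source frame to the target style and pair the result
    with the unchanged source force labels, sequence by sequence."""
    dataset = []
    for frames, forces in zip(source_sequences, source_forces):
        if len(frames) != len(forces):
            raise ValueError("each frame needs exactly one force label")
        dataset.append(([translate(frame) for frame in frames], list(forces)))
    return dataset


def toy_translate(image: Image) -> Image:
    """Hypothetical placeholder for the trained source-to-target generator."""
    return [0.5 * pixel + 0.1 for pixel in image]


# Illustrative toy data: one contact sequence with three frames.
source_sequences = [[[0.0, 0.2], [0.1, 0.4], [0.2, 0.6]]]
source_forces = [[(0.0, 0.0, 0.0), (0.1, 0.0, 1.0), (0.2, 0.0, 2.0)]]

training_set = build_training_set(source_sequences, source_forces, toy_translate)
```

In the second stage, a recurrent predictor would be trained on `training_set` and then applied to real image sequences from the target sensor; that part is omitted here because it depends on architectural details not given in this summary.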

Abstract

Vision-based tactile sensors (VBTSs) provide high-resolution tactile images crucial for robot in-hand manipulation. However, force sensing in VBTSs is underutilized due to the costly and time-intensive process of acquiring paired tactile images and force labels. In this study, we introduce a transferable force prediction model, TransForce, designed to leverage collected image-force paired data for new sensors under varying illumination colors and marker patterns while improving the accuracy of predicted forces, especially in the shear direction. Our model translates tactile images from the source domain to the target domain, ensuring that the generated tactile images reflect the illumination colors and marker patterns of the new sensors while preserving the elastomer deformation observed in existing sensors, which benefits force prediction for new sensors. A recurrent force prediction model trained with generated sequential tactile images and existing force labels is then employed to estimate forces for new sensors more accurately than models trained with single images, with lowest average errors of 0.69 N (5.8\% of the full work range) in the $x$-axis, 0.70 N (5.8\%) in the $y$-axis, and 1.11 N (6.9\%) in the $z$-axis. The experimental results also reveal that the marker-only modality is more helpful than the RGB modality in improving force accuracy in the shear direction, while the RGB modality shows better performance in the normal direction.
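
The abstract reports each error both in newtons and as a percentage of the full work range. A minimal sketch of that kind of metric is given below, assuming the error is a per-axis mean absolute error divided by the sensor's force range on that axis. The function names, the toy predictions, and the range values are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple

Force = Tuple[float, float, float]  # (Fx, Fy, Fz) in newtons


def mean_abs_error_per_axis(
    predicted: Sequence[Force], labels: Sequence[Force]
) -> Tuple[float, ...]:
    """Average absolute difference between prediction and label on each axis."""
    if len(predicted) != len(labels) or not labels:
        raise ValueError("need equally sized, non-empty sequences")
    sums = [0.0, 0.0, 0.0]
    for p, t in zip(predicted, labels):
        for axis in range(3):
            sums[axis] += abs(p[axis] - t[axis])
    return tuple(s / len(labels) for s in sums)


def percent_of_range(
    errors: Sequence[float], full_range: Sequence[float]
) -> Tuple[float, ...]:
    """Express each per-axis error as a percentage of that axis's work range."""
    return tuple(100.0 * e / r for e, r in zip(errors, full_range))


# Illustrative values only; the ranges below are assumptions, not the paper's.
predicted = [(1.0, 0.0, 5.0), (2.0, 1.0, 7.0)]
labels = [(0.5, 0.5, 4.0), (2.5, 0.5, 8.0)]
assumed_range = (10.0, 10.0, 20.0)

mae = mean_abs_error_per_axis(predicted, labels)
relative = percent_of_range(mae, assumed_range)
```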

Paper Structure (16 sections, 6 equations, 6 figures, 3 tables)


Figures (6)

  • Figure 1: Transferable force prediction model for VBTSs. The image translation model takes tactile images $\mathbf{I}_s$ from the source domain as inputs and generates tactile images $\hat{\mathbf{I}}_t$, which share illumination colors and marker patterns with the target-domain images $\mathbf{I}_t$ while preserving the deformation and contact shape of the source-domain images $\mathbf{I}_s$. After (I) training with sequential tactile images $\hat{\mathbf{I}}_t$ and force labels $\mathbf{F}_s$, (II) the force prediction model $\hat{\phi}$ is able to infer forces $\hat{\mathbf{F}}_t$ from sequential tactile images $\mathbf{I}_t$.
  • Figure 2: Pipeline of the TransForce model. (a-b) Image translation process for (a) translating the tactile image $\mathbf{I}_s$ to $\hat{\mathbf{I}}_t$ by (b) training a generative model with generator $G_s$ mapping images from $\mathcal{S} \rightarrow \mathcal{T}$ and generator $G_t$ mapping images from $\mathcal{T} \rightarrow \mathcal{S}$. The discriminator $D_t$ aims to discriminate $\hat{\mathbf{I}}_t$ from $\mathbf{I}_t$, while $D_s$ aims to discriminate $\hat{\mathbf{I}}_s$ from $\mathbf{I}_s$. (c) Sequential force prediction model. The model is trained with sequential generated images $\hat{\mathbf{I}}^1_t \sim \hat{\mathbf{I}}^T_t$ and force labels $\mathbf{F}^1_s \sim \mathbf{F}^T_s$, and predicts forces $\hat{\mathbf{F}}^1_t \sim \hat{\mathbf{F}}^T_t$ by taking $\mathbf{I}^1_t \sim \mathbf{I}^T_t$ as input.
  • Figure 3: (a) Real-world setup for data collection. (b) 3D-printed indenters. (c) Contact path for applying normal force and shear force.
  • Figure 4: (a-b) Visualization of tactile image translation with six selected types of 3D-printed indenters from the $seen$ group and the $unseen$ group. (b) Image translation results for models with and without (w/o) identity loss.
  • Figure 5: Force prediction performance using the source-only method ($top$ row in each panel, denoted as $\eta(\cdot)$) and the TransForce model ($bottom$ row, denoted as $\phi(\cdot)$) with three types of tactile images: (a-b) RGB images with markers $rm$, (c) RGB images without markers $r$, and (d) marker-only tactile images $m$. Note that (a) is tested with the model without LSTM (with superscript $o$), while (b-d) are with LSTM.
  • ...and 1 more figure
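
The Figure 2 caption describes two generators and two discriminators, and the Figure 4 caption mentions an identity loss. For orientation, the objectives of a standard CycleGAN can be written in that notation as below. This is the textbook form only; the paper's own equations, loss variants, and weights are not shown in this summary and may differ, and the weights $\lambda_{cyc}$ and $\lambda_{id}$ are assumed names.

```latex
% Standard CycleGAN objectives in the caption's notation (assumed form).
% G_s : S -> T, G_t : T -> S, with \hat{I}_t = G_s(I_s) and \hat{I}_s = G_t(I_t).
\begin{align}
\mathcal{L}_{adv}(G_s, D_t) &= \mathbb{E}_{\mathbf{I}_t}\big[\log D_t(\mathbf{I}_t)\big]
  + \mathbb{E}_{\mathbf{I}_s}\big[\log\big(1 - D_t(G_s(\mathbf{I}_s))\big)\big] \\
\mathcal{L}_{adv}(G_t, D_s) &= \mathbb{E}_{\mathbf{I}_s}\big[\log D_s(\mathbf{I}_s)\big]
  + \mathbb{E}_{\mathbf{I}_t}\big[\log\big(1 - D_s(G_t(\mathbf{I}_t))\big)\big] \\
\mathcal{L}_{cyc} &= \mathbb{E}_{\mathbf{I}_s}\big[\lVert G_t(G_s(\mathbf{I}_s)) - \mathbf{I}_s \rVert_1\big]
  + \mathbb{E}_{\mathbf{I}_t}\big[\lVert G_s(G_t(\mathbf{I}_t)) - \mathbf{I}_t \rVert_1\big] \\
\mathcal{L}_{id} &= \mathbb{E}_{\mathbf{I}_t}\big[\lVert G_s(\mathbf{I}_t) - \mathbf{I}_t \rVert_1\big]
  + \mathbb{E}_{\mathbf{I}_s}\big[\lVert G_t(\mathbf{I}_s) - \mathbf{I}_s \rVert_1\big] \\
\mathcal{L} &= \mathcal{L}_{adv}(G_s, D_t) + \mathcal{L}_{adv}(G_t, D_s)
  + \lambda_{cyc}\,\mathcal{L}_{cyc} + \lambda_{id}\,\mathcal{L}_{id}
\end{align}
```

In this form, the cycle term encourages a translated image to map back to its original, which is what ties the generated image to the source deformation, while the identity term discourages a generator from altering an image that is already in its output domain.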