PseudoTouch: Efficiently Imaging the Surface Feel of Objects for Robotic Manipulation
Adrian Röfer, Nick Heppert, Abdallah Ayad, Eugenio Chisari, Abhinav Valada
TL;DR
This work introduces PseudoTouch, a lightweight framework that infers tactile readings from small depth patches by learning a low-dimensional visual-tactile embedding. The approach maps depth inputs $z_d \in \mathbb{R}^{17\times17}$ to tactile outputs $\tilde{\tau} \in \mathbb{R}^{15}$ using a compact neural network, trained on data from eight primitive shapes and then applied to everyday objects. The authors validate the embedding on two downstream tasks: object recognition, reaching $84\%$ accuracy after ten touches on a set of basic shapes and household objects, and grasp stability prediction, where predictions derived from the inferred tactile signals substantially outperform a baseline relying on partial point clouds, including in sim2real settings. The paper also demonstrates data-efficient training via simulated depth patches and releases data, code, and models to facilitate adoption in robotics research and applications.
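To make the input and output shapes concrete, the following is a minimal sketch of an encoder-decoder that takes a $17\times17$ depth patch to a 15-dimensional tactile prediction. Only the two shapes come from the text above. The embedding size `LATENT_DIM`, the single dense layer per stage, the `tanh` activation, and the random untrained weights are all assumptions made for illustration, and the helper names are hypothetical; this is not the authors' architecture.

```python
import math
import random

PATCH_SIDE = 17    # depth patch is 17 x 17 (from the text)
TACTILE_DIM = 15   # tactile signal has 15 values (from the text)
LATENT_DIM = 8     # ASSUMPTION: illustrative embedding size, not from the paper


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer (untrained placeholder)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation=None):
    """Apply y = W x + b, optionally followed by an element-wise activation."""
    weights, biases = layer
    y = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(v) for v in y] if activation else y


def encode(depth_patch, encoder):
    """Flatten a 17 x 17 depth patch and map it to the low-dimensional embedding."""
    flat = [v for row in depth_patch for v in row]
    return dense(flat, encoder, activation=math.tanh)


def decode(embedding, decoder):
    """Map the embedding to a predicted 15-dimensional tactile signal."""
    return dense(embedding, decoder)


rng = random.Random(0)
encoder = init_layer(PATCH_SIDE * PATCH_SIDE, LATENT_DIM, rng)
decoder = init_layer(LATENT_DIM, TACTILE_DIM, rng)

# Synthetic depth patch (metres), standing in for a crop around the contact point.
z_d = [[0.30 + 0.001 * (i + j) for j in range(PATCH_SIDE)] for i in range(PATCH_SIDE)]
embedding = encode(z_d, encoder)
tau_pred = decode(embedding, decoder)
print(len(embedding), len(tau_pred))  # prints "8 15"
```

In a trained model the weights would be fitted on aligned depth and tactile pairs; here they are random, so only the dimensions of `embedding` and `tau_pred` are meaningful.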
Abstract
Tactile sensing is vital for human dexterous manipulation; however, it has not been widely used in robotics. Compact, low-cost sensing platforms can facilitate a change, but unlike their popular optical counterparts, they are difficult to deploy in high-fidelity tasks due to their low signal dimensionality and lack of a simulation model. To overcome these challenges, we introduce PseudoTouch, which links high-dimensional structural information to low-dimensional sensor signals. It does so by learning a low-dimensional visual-tactile embedding, wherein we encode a depth patch from which we decode the tactile signal. We collect a dataset of aligned tactile and visual data pairs through random touching of eight basic geometric shapes and train PseudoTouch on it. We demonstrate the utility of our trained PseudoTouch model in two downstream tasks: object recognition and grasp stability prediction. In the object recognition task, we evaluate the learned embedding's performance on a set of five basic geometric shapes and five household objects. Using PseudoTouch, we achieve an object recognition accuracy of 84% after just ten touches, surpassing a proprioception baseline. For the grasp stability task, we use ACRONYM labels to train and evaluate a grasp success predictor using PseudoTouch's predictions derived from virtual depth information. Our approach yields a 32% absolute improvement in accuracy compared to the baseline relying on partial point cloud data. We make the data, code, and trained models publicly available at https://pseudotouch.cs.uni-freiburg.de.
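The phrase "32% absolute improvement" refers to a difference in accuracy measured in percentage points, which is not the same as a 32% relative gain. The short sketch below shows the distinction. The baseline accuracy used here is a hypothetical value chosen only to make the arithmetic easy to follow; the abstract does not report the baseline's accuracy, and the function names are illustrative.

```python
def absolute_gain(baseline_acc, new_acc):
    """Difference in accuracy, in percentage points."""
    return new_acc - baseline_acc


def relative_gain(baseline_acc, new_acc):
    """Improvement expressed as a percentage of the baseline accuracy."""
    return 100.0 * (new_acc - baseline_acc) / baseline_acc


# HYPOTHETICAL baseline for illustration only; not a number from the paper.
baseline_acc = 50.0
new_acc = baseline_acc + 32.0  # a 32-point absolute improvement

abs_gain = absolute_gain(baseline_acc, new_acc)
rel_gain = relative_gain(baseline_acc, new_acc)
print(f"absolute: {abs_gain:.0f} points, relative: {rel_gain:.0f}%")
# prints "absolute: 32 points, relative: 64%"
```

With this hypothetical baseline, the same 32-point gain corresponds to a 64% relative improvement, which is why the qualifier "absolute" matters when reading the result.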
