Toward Artificial Palpation: Representation Learning of Touch on Soft Bodies

Zohar Rimon, Elisei Shafer, Tal Tepper, Efrat Shimron, Aviv Tamar

TL;DR

Toward Artificial Palpation presents a self-supervised representation learning framework that converts a sequence of tactile measurements into a single latent representation $z_T$ capturing the structure of a soft object. An encoder–decoder acts as a forward model for forces, while a downstream flow-matching predictor maps the representation to images of the object, with MRI scans serving as ground truth; this enables tactile imaging and change detection. The approach is validated in the PalpationSim simulator and on real modular breast phantoms, showing that lump position and size can be recovered more reliably from learned representations than from raw force maps, with data volume and simple augmentations driving the gains. The work points toward foundation models for touch, outlines data requirements for clinical translation, and discusses limitations and future directions for real-world deployment.

Abstract

Palpation, the use of touch in medical examination, is almost exclusively performed by humans. We investigate a proof of concept for an artificial palpation method based on self-supervised learning. Our key idea is that an encoder-decoder framework can learn a $\textit{representation}$ from a sequence of tactile measurements that contains all the relevant information about the palpated object. We conjecture that such a representation can be used for downstream tasks such as tactile imaging and change detection. With enough training data, it should capture intricate patterns in the tactile measurements that go beyond a simple map of forces -- the current state of the art. To validate our approach, we both develop a simulation environment and collect a real-world dataset of soft objects and corresponding ground truth images obtained by magnetic resonance imaging (MRI). We collect palpation sequences using a robot equipped with a tactile sensor, and train a model that predicts sensory readings at different positions on the object. We investigate the representation learned in this process, and demonstrate its use in imaging and change detection.

Paper Structure

This paper contains 45 sections, 3 equations, 25 figures, 5 tables, 1 algorithm.

Figures (25)

  • Figure 1: Proof of concept system for learning artificial breast palpation. Left: we fabricate soft objects and palpate them using a tactile sensor mounted on a robot arm. We also obtain MRI scans of the objects as ground truth object models. Middle: we train an encoder-decoder neural network to predict the tactile measurements at given positions from a sequence of previous measurements. Right: we use the learned representation to train a model for tactile imaging, and perform change detection based on predicted images. In principle, by replacing the phantoms with human subjects, our system can be used for clinical studies.
  • Figure 2: Representation Learning: (a) A sequence of tactile measurements and poses is encoded by first encoding every measurement+pose by a force-location encoder (FLE), and then encoding the sequence by a GRU. (b) The decoder predicts a tactile measurement at time $t'_{k'}$ from the representation at time $t_k$ and the pose at time $t'_{k'}$.
  • Figure 3: PalpationSim simulator and simulation results. (a) A $2$-dimensional finite-element model of a round sensor pressing on a soft object with a harder lump inside; the inset shows the forces on the sensor. (b+c) A ground truth image of the body (b), and a predicted image (c). (d-f) Image prediction results: (d) with and without self-supervised pretraining, (e) with and without permutation augmentation, (f) with different numbers of trajectories per trial. See text for details.
  • Figure 4: Modular Breast Phantom Design. (a) The insert has an octagonal 3D-printed base, a soft-silicone skin, and is filled with polyvinyl acetate hydrogel. A soft silicone "lump" is embedded within and attached to the base. (b) The shell has two layers of soft silicone skin, with hydrogel in between, where the bottom layer is attached to a 3D-printed base with an octagonal hole. The insert can be positioned in $8$ possible orientations inside the shell. (c) An assembled shell+insert and a standalone insert. The bar-code labels allow the component types and orientations to be recorded automatically using an overhead camera.
  • Figure 5: Tactile imaging with real data. Columns show: (a) 3D CAD design, (b) Ground-truth MRI image slice, (c) Predicted images using our method, (d) Force-map visualization.
  • ...and 20 more figures
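
The data flow described in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name `TinyPalpationModel`, all dimensions, the single-layer force-location encoder, the bias-free GRU cell, and the linear decoder are assumptions made for the example, and the weights are random and untrained. It only shows how each measurement+pose pair is embedded, how the sequence is folded into one representation by a recurrent update, and how a measurement at a query pose is predicted from that representation.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (hypothetical initialisation)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyPalpationModel:
    """Illustrative encoder-decoder: FLE -> GRU -> decoder (untrained)."""

    def __init__(self, force_dim, pose_dim, embed_dim, latent_dim, seed=0):
        rng = random.Random(seed)
        self.latent_dim = latent_dim
        # Force-location encoder (FLE): one tanh layer over [force, pose].
        self.w_fle = make_matrix(embed_dim, force_dim + pose_dim, rng)
        # GRU gates act on the concatenation [embedding, hidden state].
        gru_in = embed_dim + latent_dim
        self.w_update = make_matrix(latent_dim, gru_in, rng)
        self.w_reset = make_matrix(latent_dim, gru_in, rng)
        self.w_cand = make_matrix(latent_dim, gru_in, rng)
        # Decoder: linear map from [representation, query pose] to a force.
        self.w_dec = make_matrix(force_dim, latent_dim + pose_dim, rng)

    def fle(self, force, pose):
        """Embed one measurement+pose pair."""
        return [math.tanh(a) for a in matvec(self.w_fle, force + pose)]

    def gru_step(self, h, x):
        """One GRU update of hidden state h with input embedding x."""
        update = [sigmoid(a) for a in matvec(self.w_update, x + h)]
        reset = [sigmoid(a) for a in matvec(self.w_reset, x + h)]
        gated = [r * hi for r, hi in zip(reset, h)]
        cand = [math.tanh(a) for a in matvec(self.w_cand, x + gated)]
        return [(1 - u) * hi + u * c for u, hi, c in zip(update, h, cand)]

    def encode(self, forces, poses):
        """Fold the whole palpation sequence into one representation."""
        h = [0.0] * self.latent_dim
        for force, pose in zip(forces, poses):
            h = self.gru_step(h, self.fle(force, pose))
        return h

    def decode(self, z, query_pose):
        """Predict the tactile measurement at a query pose."""
        return matvec(self.w_dec, z + query_pose)


# Toy sequence: three 3-D force readings taken at three 2-D poses.
model = TinyPalpationModel(force_dim=3, pose_dim=2, embed_dim=4, latent_dim=5)
forces = [[0.1, 0.0, 0.2], [0.3, 0.1, 0.0], [0.0, 0.2, 0.1]]
poses = [[0.0, 0.0], [0.5, 0.1], [1.0, 0.2]]
z = model.encode(forces, poses)
pred = model.decode(z, [0.25, 0.05])
print(len(z), len(pred))
```

In a trained version of such a model, the decoder's prediction would be compared with the measurement actually recorded at the query pose, which is what makes the objective self-supervised: no labels beyond the tactile data itself are needed to learn the representation.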