Monocular Reconstruction of Neural Tactile Fields
Pavan Mantripragada, Siddhanth Deshmukh, Eadom Dessalene, Manas Desai, Yiannis Aloimonos
TL;DR
This work addresses the need for interaction-aware 3D scene representations by introducing neural tactile fields, a dense 3D map from location to predicted tactile response that can be inferred from a single monocular RGB image. The method extends a Large Reconstruction Model (LRM) with a finetuned triplane decoder to jointly predict geometry and a 3D tactile field S(x), supervised by a new high-resolution visuotactile dataset acquired with GelSight pressure measurements. The authors demonstrate improved volumetric and surface reconstruction over state-of-the-art monocular methods and show that predicted tactile fields enable interaction-aware planning in which paths pass through deformable regions while avoiding rigid obstacles. By forecasting how objects will resist or yield to contact without online physical interaction, the approach offers a practical route to safer and more efficient planning in contact-rich settings such as agricultural robotics. The work also contributes a dataset and a scalable training framework that can initialize more advanced physically grounded reconstructions.
Abstract
Robots operating in the real world must plan through environments that deform, yield, and reconfigure under contact, requiring interaction-aware 3D representations that extend beyond static geometric occupancy. To address this, we introduce neural tactile fields, a novel 3D representation that maps spatial locations to the expected tactile response upon contact. Our model is the first to predict these neural tactile fields from a single monocular RGB image. When integrated with off-the-shelf path planners, neural tactile fields enable robots to generate paths that avoid high-resistance objects while deliberately routing through low-resistance regions (e.g., foliage), rather than treating all occupied space as equally impassable. Empirically, our learning framework improves volumetric 3D reconstruction by $85.8\%$ and surface reconstruction by $26.7\%$ compared to state-of-the-art monocular 3D reconstruction methods (LRM and Direct3D).
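
To illustrate how a tactile field could feed an off-the-shelf planner, the following is a minimal sketch and not the authors' implementation. It assumes the field has already been sampled onto a 2D grid of resistance values normalized to [0, 1], and it uses plain Dijkstra search as the planner. The function name `plan_path`, the parameters `rigid_threshold` and `penalty`, and the example grid are all hypothetical choices made for illustration.

```python
import heapq

def plan_path(resistance, start, goal, rigid_threshold=0.8, penalty=5.0):
    """Dijkstra on a 4-connected grid of tactile resistance values.

    Assumptions (not from the paper): resistance is normalized to [0, 1],
    cells at or above rigid_threshold are impassable, and entering any other
    cell costs 1 + penalty * resistance.
    """
    rows, cols = len(resistance), len(resistance[0])
    dist = {start: 0.0}
    parent = {}
    heap = [(0.0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if d > dist.get(cell, float("inf")):
            continue  # stale heap entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            s = resistance[nr][nc]
            if s >= rigid_threshold:
                continue  # rigid obstacle: never enter
            nd = d + 1.0 + penalty * s  # soft regions are passable but cost more
            if nd < dist.get((nr, nc), float("inf")):
                dist[(nr, nc)] = nd
                parent[(nr, nc)] = cell
                heapq.heappush(heap, (nd, (nr, nc)))
    if goal not in dist:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]

# Hypothetical scene: a rigid wall (1.0) in column 2 with a soft gap (0.1).
grid = [
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
]
print(plan_path(grid, (1, 0), (1, 4)))
```

In this toy scene, a planner that treats every occupied cell as an obstacle would find no route, whereas the resistance-weighted cost lets the path cross the soft gap. Lowering `rigid_threshold` below 0.1 closes the gap and makes the function return `None`.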
