SemanticFeels: Semantic Labeling during In-Hand Manipulation
Anas Al Shikh Khalil, Haozhi Qi, Roberto Calandra
TL;DR
SemanticFeels extends NeuralFeels to label materials during in-hand robot manipulation by fusing tactile material predictions with a neural implicit surface representation. A fine-tuned EfficientNet-B0 classifies the local material from Digit tactile images, and these predictions are embedded into an augmented neural signed distance field (SDF) that jointly predicts geometry and material regions. A 20,749-sample tactile dataset covering four materials supports offline training, and real-time experiments show high per-sensor accuracy and an average material-map matching accuracy of 79.87% on a multi-material object. The work is a step toward tactile-anchored semantic understanding in dexterous manipulation, which could make manipulation policies more adaptive and robust to material variation.
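To make the reported matching figure concrete, the sketch below shows one plausible way such a score could be computed: the fraction of sampled surface points whose predicted material label equals a reference label, averaged over trials. This summary does not specify the exact metric, so the definition, the function names, the integer material ids, and the toy labels are all illustrative assumptions, not the authors' evaluation code.

```python
from typing import Sequence, Tuple


def matching_accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of surface points whose predicted label equals the reference label.

    ASSUMPTION: point-wise label agreement; the paper's exact metric may differ.
    """
    if len(predicted) != len(actual):
        raise ValueError("label sequences must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one surface point is required")
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(predicted)


def average_matching_accuracy(
    trials: Sequence[Tuple[Sequence[int], Sequence[int]]]
) -> float:
    """Unweighted mean of the per-trial matching accuracies."""
    scores = [matching_accuracy(pred, ref) for pred, ref in trials]
    return sum(scores) / len(scores)


# Toy data only: hypothetical integer ids 0..3 stand in for the four materials.
trial_a = ([0, 0, 1, 1], [0, 0, 1, 2])  # 3 of 4 points agree
trial_b = ([2, 2, 3, 3], [2, 2, 3, 3])  # 4 of 4 points agree
average = average_matching_accuracy([trial_a, trial_b])
```

Averaging per trial rather than pooling all points gives every manipulation trial equal weight regardless of how many surface points it contributes; pooling would be an equally reasonable choice under different assumptions.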
Abstract
As robots become increasingly integrated into everyday tasks, their ability to perceive both the shape and properties of objects during in-hand manipulation becomes critical for adaptive and intelligent behavior. We present SemanticFeels, an extension of the NeuralFeels framework that integrates semantic labeling with a neural implicit shape representation built from vision and touch. To illustrate its application, we focus on material classification: high-resolution Digit tactile readings are processed by a fine-tuned EfficientNet-B0 convolutional neural network (CNN) to generate local material predictions, which are then embedded into an augmented signed distance field (SDF) network that jointly predicts geometry and continuous material regions. Experimental results show that the system achieves a high correspondence between predicted and actual materials on both single- and multi-material objects, with an average matching accuracy of 79.87% across multiple manipulation trials on a multi-material object.
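The idea of an SDF network that also outputs material can be illustrated with a minimal, dependency-free sketch: a shared trunk maps a 3D query point to features, one head returns a signed distance, and a second head returns a probability over the four materials. Everything here is an assumption made for illustration (the class name `AugmentedSDF`, the single hidden layer, its width, the random untrained weights, the absence of positional encoding); the abstract does not describe the actual architecture or how the tactile predictions supervise the material output.

```python
import math
import random

NUM_MATERIALS = 4  # the tactile dataset covers four materials
HIDDEN = 8         # hypothetical hidden width, chosen only for this sketch


def make_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def linear(layer, x):
    """Apply y = W x + b for one input vector."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(v):
    return [max(0.0, e) for e in v]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    top = max(logits)
    exps = [math.exp(value - top) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


class AugmentedSDF:
    """Hypothetical two-headed field: point -> (signed distance, material probs)."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.trunk = make_layer(3, HIDDEN, rng)                # shared features
        self.sdf_head = make_layer(HIDDEN, 1, rng)             # geometry output
        self.material_head = make_layer(HIDDEN, NUM_MATERIALS, rng)  # semantic output

    def query(self, point):
        features = relu(linear(self.trunk, point))
        distance = linear(self.sdf_head, features)[0]
        material_probs = softmax(linear(self.material_head, features))
        return distance, material_probs


# Query one 3D point; weights are untrained, so the values carry no meaning.
field = AugmentedSDF()
distance, material_probs = field.query([0.01, -0.02, 0.03])
predicted_material = max(range(NUM_MATERIALS), key=lambda k: material_probs[k])
```

Sharing the trunk between the two heads is one way to let material regions vary continuously over the same surface the distance head describes; separate networks per output would be another option.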
