Object-Oriented Material Classification and 3D Clustering for Improved Semantic Perception and Mapping in Mobile Robots
Siva Krishna Ravipati, Ehsan Latif, Ramviyas Parasuraman, Suchendra M. Bhandarkar
TL;DR
This work tackles material-aware perception for mobile robots by integrating RGB-D data with SLAM to produce a 3D semantic map that jointly encodes object and material information. The authors propose a three-part approach: an object-detection pipeline (YOLOv5) to locate objects, a complementarity-aware fusion network (CAFN) for robust RGB-D material classification, and a voxel-based multiscale clustering framework (VOXM with MSCC) to propagate material labels into the 3D map generated by ORB-SLAM2. Experiments on public RGB and RGB-D datasets, plus real-world robot deployments, show up to 15% improvement in material classification and 3D clustering accuracy over state-of-the-art baselines, with mean IoU ~0.8 and mAP ~0.65 in real deployments. In practice, the resulting semantic maps carry richer information for planning, interaction, and multi-robot collaboration, and the work is backed by open-source code and new RGB-D datasets.
Abstract
Classification of different object surface material types can play a significant role in the decision-making algorithms for mobile robots and autonomous vehicles. RGB-based scene-level semantic segmentation has been well addressed in the literature. However, improving material recognition using the depth modality, and integrating it with SLAM algorithms for 3D semantic mapping, could unlock new benefits in the robotics perception pipeline. To this end, we propose a complementarity-aware deep learning approach for RGB-D-based material classification built on top of an object-oriented pipeline. The approach further integrates the ORB-SLAM2 method for 3D scene mapping with multiscale clustering of the detected material semantics in the point cloud map generated by the visual SLAM algorithm. Extensive experimental results with existing public datasets and newly contributed real-world robot datasets demonstrate a significant improvement in material classification and 3D clustering accuracy compared to state-of-the-art approaches for 3D semantic scene mapping.
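To make the idea of attaching material semantics to a point cloud map more concrete, the following is a minimal sketch of voxel-level label aggregation by majority vote. It is not the authors' clustering method: the function names (voxel_index, vote_voxel_labels), the voxel_size parameter, and the majority-vote rule are all assumptions chosen for illustration, and the sketch works at a single scale only.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Iterable, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def voxel_index(point: Point, voxel_size: float) -> Voxel:
    """Map a 3D point to the integer index of the voxel that contains it."""
    # Floor division keeps negative coordinates in the correct voxel.
    x, y, z = point
    return (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))


def vote_voxel_labels(
    points: Iterable[Point],
    labels: Iterable[Hashable],
    voxel_size: float = 0.1,
) -> Dict[Voxel, Hashable]:
    """Assign each occupied voxel the most frequent label among its points.

    Hypothetical helper for illustration: a single-scale majority vote,
    not the multiscale clustering described in the paper.
    """
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")

    votes: Dict[Voxel, Counter] = defaultdict(Counter)
    for point, label in zip(points, labels):
        votes[voxel_index(point, voxel_size)][label] += 1

    # most_common(1) returns a one-element list of (label, count) pairs.
    return {voxel: counts.most_common(1)[0][0] for voxel, counts in votes.items()}


# Example input: three points fall in one voxel, one point in another.
example_points = [
    (0.01, 0.02, 0.03),
    (0.04, 0.05, 0.06),
    (0.07, 0.01, 0.02),
    (1.01, 1.02, 1.03),
]
example_labels = ["wood", "wood", "metal", "plastic"]
example_map = vote_voxel_labels(example_points, example_labels, voxel_size=0.5)
```

A majority vote is only one plausible way to reconcile noisy per-point predictions; a real system would likely also weight votes by classifier confidence and merge neighbouring voxels across several scales.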
