MARS: Multimodal Active Robotic Sensing for Articulated Characterization
Hongliang Zeng, Ping Zhang, Chengjiong Wu, Jiahua Wang, Tingyu Ye, Fang Li
TL;DR
MARS addresses robust joint parameter estimation for articulated objects by combining multimodal RGB and point-cloud data with a reinforcement-learning-based active sensing strategy. It introduces Multimodal Feature Fusion Perception (MFFP) and Active Sensing (AS), augmented by MLDM for adaptive RGB feature aggregation and by a transformer-based fusion block. The system predicts joint parameters for revolute and prismatic joints and outputs a perception score that gauges viewpoint quality; active sensing uses this to improve viewpoint selection and estimation accuracy, and real-world experiments show practical applicability. MARS achieves state-of-the-art joint parameter estimation on PartNet-Mobility, is more robust under suboptimal viewpoints, and enables effective command-based manipulation in real-world scenarios.
Abstract
Precise perception of articulated objects is vital for empowering service robots. Recent studies mainly rely on point clouds alone, a single-modality approach that neglects vital texture and lighting details, and they assume ideal conditions such as optimal viewpoints, which are unrepresentative of real-world scenarios. To address these limitations, we introduce MARS, a novel framework for articulated object characterization. It features a multi-modal fusion module that uses multi-scale RGB features to enhance point cloud features, coupled with reinforcement learning-based active sensing for autonomous optimization of observation viewpoints. In experiments on various articulated object instances from the PartNet-Mobility dataset, our method outperforms current state-of-the-art methods in joint parameter estimation accuracy. Additionally, through active sensing, MARS further reduces errors, demonstrating enhanced efficiency in handling suboptimal viewpoints. Furthermore, our method generalizes effectively to real-world articulated objects, enhancing robot interactions. Code is available at https://github.com/robhlzeng/MARS.
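As a rough illustration of what "joint parameter estimation accuracy" can mean in practice, the sketch below computes two error measures often used for articulated objects: the angle between predicted and ground-truth joint axis directions, and the shortest distance between the two axis lines (meaningful for revolute joints, whose axis has a position). This is a minimal sketch under assumptions: the abstract does not state the exact metrics, and the function names `axis_angle_error_deg` and `axis_line_distance` are hypothetical, not taken from the MARS codebase.

```python
import numpy as np


def axis_angle_error_deg(pred_axis, gt_axis):
    """Angle in degrees between two joint axis directions.

    Assumption: an axis and its negation describe the same joint,
    so the absolute value of the cosine is used.
    """
    p = np.asarray(pred_axis, dtype=float)
    g = np.asarray(gt_axis, dtype=float)
    p = p / np.linalg.norm(p)
    g = g / np.linalg.norm(g)
    cos = np.clip(abs(np.dot(p, g)), 0.0, 1.0)
    return float(np.degrees(np.arccos(cos)))


def axis_line_distance(pred_point, pred_axis, gt_point, gt_axis, eps=1e-8):
    """Shortest distance between two 3D lines, each given by a point and a direction."""
    p1 = np.asarray(pred_point, dtype=float)
    d1 = np.asarray(pred_axis, dtype=float)
    p2 = np.asarray(gt_point, dtype=float)
    d2 = np.asarray(gt_axis, dtype=float)

    normal = np.cross(d1, d2)
    normal_len = np.linalg.norm(normal)
    if normal_len < eps:
        # Parallel lines: distance from p2 to the line through p1 along d1.
        d1_unit = d1 / np.linalg.norm(d1)
        return float(np.linalg.norm(np.cross(p2 - p1, d1_unit)))
    # Skew or intersecting lines: project the offset onto the common normal.
    return float(abs(np.dot(p2 - p1, normal)) / normal_len)
```

For a prismatic joint only the direction error applies, since translating the axis does not change the motion it describes.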
