Articulated Object Manipulation using Online Axis Estimation with SAM2-Based Tracking
Xi Wang, Tianxing Chen, Qiaojun Yu, Tianling Xu, Zanxin Chen, Yiting Fu, Ziqi He, Cewu Lu, Yao Mu, Ping Luo
TL;DR
This paper addresses articulated object manipulation through online axis estimation integrated with SAM2-based tracking. It proposes a closed-loop pipeline that uses interactive perception to induce motion, SAM2 for segmentation, and online axis estimation from moving-part point clouds to guide manipulation. The method distinguishes two axis types (prismatic and revolute) and refines axis estimates over time via a sliding window, improving precision and robustness over open-loop baselines. Experimental results in simulation and real-world deployment show significant improvements in axis-based manipulation tasks such as door and drawer opening, demonstrating practical applicability and generalization. This approach advances perception-action coupling for articulated object manipulation.
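To make the sliding-window refinement concrete, the following is a minimal sketch of one plausible way to smooth per-frame axis directions. It is not taken from the paper: the class name `SlidingWindowAxis`, the window size, and the choice of a sign-aligned mean are all assumptions made for illustration.

```python
from collections import deque
import math


def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("zero-length axis direction")
    return tuple(c / n for c in v)


class SlidingWindowAxis:
    """Hypothetical helper: averages the most recent axis directions."""

    def __init__(self, window_size=5):
        # window_size is an assumed value, not one reported by the paper.
        self.window = deque(maxlen=window_size)

    def update(self, direction):
        d = normalize(direction)
        # An axis direction is sign-ambiguous (d and -d describe the same
        # line), so flip the new estimate to agree with the previous one.
        if self.window and sum(a * b for a, b in zip(d, self.window[-1])) < 0.0:
            d = tuple(-c for c in d)
        self.window.append(d)
        # Component-wise mean over the window, renormalized to unit length.
        mean = tuple(sum(col) / len(self.window) for col in zip(*self.window))
        return normalize(mean)
```

Calling `update` once per new frame returns the current smoothed direction; older estimates drop out automatically once the window is full.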
Abstract
Articulated object manipulation requires precise object interaction, where the object's axis must be carefully considered. Previous research has employed interactive perception for manipulating articulated objects, but such approaches are typically open-loop and often overlook the interaction dynamics. To address this limitation, we present a closed-loop pipeline integrating interactive perception with online axis estimation from segmented 3D point clouds. Our method can build on any interactive perception technique, using it to induce slight object movement and generate point cloud frames of the evolving dynamic scene. These point clouds are then segmented using Segment Anything Model 2 (SAM2), after which the moving part of the object is masked for accurate online axis estimation, which guides subsequent robotic actions. Our approach significantly enhances the precision and efficiency of manipulation tasks involving articulated objects. Experiments in simulated environments demonstrate that our method outperforms baseline approaches, especially in tasks that demand precise axis-based control. Project Page: https://hytidel.github.io/video-tracking-for-axis-estimation/.
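As an illustration of estimating an axis from the masked moving part, the sketch below fits a rigid transform between two point cloud frames and reads a prismatic or revolute axis from it. It is a generic approach (the Kabsch algorithm) under strong assumptions, not the authors' implementation: the two frames are assumed to contain the same points in the same order, and the function names and the 2-degree threshold are hypothetical.

```python
import numpy as np


def fit_rigid_transform(src, dst):
    """Least-squares R, t such that dst ~= src @ R.T + t (Kabsch algorithm)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Force det(R) = +1 so the result is a rotation, not a reflection.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t


def estimate_axis(src, dst, angle_threshold=np.deg2rad(2.0)):
    """Classify the motion and return (kind, unit direction, point on axis).

    Assumes the part actually moved between the two frames; the threshold
    is an illustrative value.
    """
    R, t = fit_rigid_transform(src, dst)
    angle = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if angle < angle_threshold:
        # Negligible rotation: treat the motion as a pure translation.
        return "prismatic", t / np.linalg.norm(t), None
    # The skew-symmetric part of R points along the rotation axis.
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    direction = w / np.linalg.norm(w)
    # For a rotation about a fixed axis, axis points p satisfy (I - R) p = t.
    # I - R has rank 2, so take the minimum-norm least-squares solution,
    # which is the axis point closest to the origin.
    point = np.linalg.lstsq(np.eye(3) - R, t, rcond=1e-8)[0]
    return "revolute", direction, point
```

In practice the points in consecutive frames would not be in correspondence, so a registration step would be needed before a fit like this; the sketch only shows how an axis follows once a rigid motion of the moving part is known.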
