Exploring Category-level Articulated Object Pose Tracking on SE(3) Manifolds

Xianhui Meng, Yukang Huo, Li Zhang, Liu Liu, Haonan Jiang, Yan Zhong, Pingrui Zhang, Cewu Lu, Jun Liu

TL;DR

This paper tackles category-level articulated object pose tracking on the $SE(3)$ manifold by introducing PPF-Tracker, which combines quasi-canonicalization to exploit temporal priors, $SE(3)$-invariant voting over weighted Point Pair Features, and a kinematic-constraint-based optimization to enforce physical consistency across articulated parts. Pose increments are learned in the Lie algebra $\mathfrak{se}(3)$ and mapped to the $SE(3)$ group via the exponential map, which ensures valid rotations and stable updates. Key contributions include a dynamic keyframe strategy, a weighted PPF voting mechanism, joint-axis-aware refinement, and a loss design that balances geometry, scale, and mask terms. Experiments on synthetic, semi-synthetic, and real-world datasets demonstrate strong generalization, robustness, and real-time performance, with applications in robotic manipulation, embodied AI, and AR/VR.
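
The mapping from $\mathfrak{se}(3)$ to $SE(3)$ mentioned above can be illustrated with the standard closed-form exponential map. The following is a minimal NumPy sketch, not the authors' implementation: the function names `hat` and `se3_exp`, the twist ordering (translation part first, rotation part second), and the left-multiplied update in the usage lines are all assumptions made for illustration.

```python
import numpy as np

def hat(w):
    """Return the 3x3 skew-symmetric matrix of a 3-vector w."""
    wx, wy, wz = w
    return np.array([[0.0, -wz, wy],
                     [wz, 0.0, -wx],
                     [-wy, wx, 0.0]])

def se3_exp(xi):
    """Map a twist xi = (rho, omega) in se(3) to a 4x4 matrix in SE(3).

    The ordering (rho first, omega second) is an assumed convention.
    """
    xi = np.asarray(xi, dtype=float)
    rho, omega = xi[:3], xi[3:]
    theta = np.linalg.norm(omega)
    K = hat(omega)
    if theta < 1e-8:
        # Small-angle case: truncated Taylor series avoids division by ~0.
        R = np.eye(3) + K + 0.5 * (K @ K)
        V = np.eye(3) + 0.5 * K + (K @ K) / 6.0
    else:
        A = np.sin(theta) / theta
        B = (1.0 - np.cos(theta)) / theta**2
        C = (theta - np.sin(theta)) / theta**3
        R = np.eye(3) + A * K + B * (K @ K)   # Rodrigues' formula
        V = np.eye(3) + B * K + C * (K @ K)   # couples rotation and translation
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = V @ rho
    return T

# Usage: apply a small predicted increment to a previous pose.
T_prev = np.eye(4)
delta = np.array([0.01, 0.0, 0.02, 0.0, 0.0, 0.05])  # hypothetical increment
T_next = se3_exp(delta) @ T_prev                      # assumed left update
```

Because the rotation block comes out of Rodrigues' formula, it is orthonormal with determinant 1 by construction, so no re-orthogonalization step is needed after an update.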

Abstract

Articulated objects are prevalent in daily life and robotic manipulation tasks. However, compared to rigid objects, pose tracking for articulated objects remains an underexplored problem due to their inherent kinematic constraints. To address this challenge, this work proposes a novel point-pair-based pose tracking framework, termed PPF-Tracker. The proposed framework first performs quasi-canonicalization of point clouds in the $SE(3)$ Lie group space, and then models articulated objects using Point Pair Features (PPF) to predict pose voting parameters by leveraging the invariance properties of $SE(3)$. Finally, semantic information of joint axes is incorporated to impose unified kinematic constraints across all parts of the articulated object. PPF-Tracker is systematically evaluated on both synthetic datasets and real-world scenarios, demonstrating strong generalization across diverse and challenging environments. Experimental results highlight the effectiveness and robustness of PPF-Tracker in multi-frame pose tracking of articulated objects. We believe this work can foster advances in robotics, embodied intelligence, and augmented reality. Code is available at https://github.com/mengxh20/PPFTracker.
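
The $SE(3)$ invariance that the abstract relies on can be checked numerically with the classic four-dimensional Point Pair Feature (distance plus three angles between the normals and the connecting vector). The sketch below uses that classic definition, not the weighted variant proposed in the paper, whose weighting scheme is not described in this summary. The function names and the sample points are hypothetical.

```python
import numpy as np

def angle(a, b):
    """Angle in radians between two non-zero 3-vectors."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))

def ppf(p1, n1, p2, n2):
    """Classic PPF of two distinct oriented points (position p, normal n)."""
    d = p2 - p1
    return np.array([np.linalg.norm(d),
                     angle(n1, d),
                     angle(n2, d),
                     angle(n1, n2)])

# Two hypothetical oriented points.
p1, n1 = np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])
p2, n2 = np.array([1.0, 2.0, 0.5]), np.array([0.0, 1.0, 0.0])

# An arbitrary rigid transform: rotation about z by 40 degrees plus a shift.
a = np.deg2rad(40.0)
R = np.array([[np.cos(a), -np.sin(a), 0.0],
              [np.sin(a),  np.cos(a), 0.0],
              [0.0,        0.0,       1.0]])
t = np.array([0.3, -1.2, 2.0])

f_before = ppf(p1, n1, p2, n2)
# Points are rotated and translated; normals are only rotated.
f_after = ppf(R @ p1 + t, R @ n1, R @ p2 + t, R @ n2)
features_match = np.allclose(f_before, f_after)
```

The feature depends only on lengths and angles, both of which rigid transforms preserve, so `f_before` and `f_after` agree regardless of the object's pose.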

Paper Structure

This paper contains 14 sections, 14 equations, 8 figures, 4 tables, and 1 algorithm.

Figures (8)

  • Figure 1: The Categorization of Tracking Methods.
  • Figure 2: The Overview of Our PPF-Tracker.
  • Figure 3: Illustration of Temporal Segment and Dynamic Keyframe. The symbols $\mathcal{K}_i^k$ and $T_n^k$ are the poses of the $i$-th keyframe and the $n$-th frame, respectively. The subscript symbol $[\cdot]$ maps a keyframe index to its position in the frame stream. For instance, $[1]=0$ and $[i]=n$.
  • Figure 4: The Illustration of Quasi-Canonicalization within Temporal Segment. For clarity, (a) is abstracted as (b). The blue part represents the keyframe transformation (canonicalization). The red part depicts the quasi-canonicalization of the $t$-th frame, where the frame follows its associated keyframe's transformation.
  • Figure 5: Traditional (a) and Weighted (b) Point Pairs.
  • ...and 3 more figures
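
The quasi-canonicalization described in the Figure 4 caption can be read as applying the inverse of the associated keyframe's pose to the points of a later frame. The sketch below illustrates that reading only. It assumes rigid 4x4 poses without the per-part scale that category-level methods typically also estimate, and the names `invert_se3` and `quasi_canonicalize` are hypothetical, not taken from the paper or its code.

```python
import numpy as np

def invert_se3(T):
    """Invert a 4x4 rigid transform using its closed form."""
    R, t = T[:3, :3], T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv

def quasi_canonicalize(points, T_key):
    """Map (N, 3) camera-space points back through the keyframe pose T_key."""
    T_inv = invert_se3(T_key)
    return points @ T_inv[:3, :3].T + T_inv[:3, 3]

# Hypothetical canonical part and keyframe pose (90 degrees about z plus a shift).
canonical = np.array([[0.1, 0.0, 0.0],
                      [0.0, 0.2, 0.0],
                      [0.0, 0.0, 0.3]])
T_key = np.eye(4)
T_key[:3, :3] = np.array([[0.0, -1.0, 0.0],
                          [1.0,  0.0, 0.0],
                          [0.0,  0.0, 1.0]])
T_key[:3, 3] = [1.0, 2.0, 3.0]

# A frame observed exactly at the keyframe pose maps back to the canonical
# points; a later frame in the segment would land close to them, leaving only
# the small residual motion since the keyframe.
observed = canonical @ T_key[:3, :3].T + T_key[:3, 3]
recovered = quasi_canonicalize(observed, T_key)
```

Under this reading, the network only has to account for the motion accumulated since the keyframe rather than the full pose.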