Instance-Adaptive and Geometric-Aware Keypoint Learning for Category-Level 6D Object Pose Estimation

Xiao Lin, Wenfei Yang, Yuan Gao, Tianzhu Zhang

TL;DR

A novel Instance-Adaptive and Geometric-Aware Keypoint Learning method for category-level 6D object pose estimation (AG-Pose), which includes two key designs that work together to establish robust keypoint-level correspondences for unseen instances, thus enhancing the generalization ability of the model.

Abstract

Category-level 6D object pose estimation aims to estimate the rotation, translation and size of unseen instances within specific categories. In this area, dense correspondence-based methods have achieved leading performance. However, they do not explicitly consider the local and global geometric information of different instances, resulting in poor generalization ability to unseen instances with significant shape variations. To deal with this problem, we propose a novel Instance-Adaptive and Geometric-Aware Keypoint Learning method for category-level 6D object pose estimation (AG-Pose), which includes two key designs: (1) The first design is an Instance-Adaptive Keypoint Detection module, which can adaptively detect a set of sparse keypoints for various instances to represent their geometric structures. (2) The second design is a Geometric-Aware Feature Aggregation module, which can efficiently integrate the local and global geometric information into keypoint features. These two modules can work together to establish robust keypoint-level correspondences for unseen instances, thus enhancing the generalization ability of the model. Experimental results on the CAMERA25 and REAL275 datasets show that the proposed AG-Pose outperforms state-of-the-art methods by a large margin without category-specific shape priors.
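
To make the idea of keypoint-level correspondences concrete, the following is a minimal sketch of how a rotation, translation and scale could be recovered from matched keypoints using the classical Umeyama similarity alignment. It is an illustrative stand-in and not the pose estimator of AG-Pose, which this abstract does not specify. The function name `umeyama_similarity`, the synthetic canonical keypoints, and the ground-truth pose are all assumptions made for this example.

```python
import numpy as np


def umeyama_similarity(src, dst):
    """Least-squares similarity transform with dst ~= s * R @ src + t.

    src, dst: (K, 3) arrays of matched keypoints.
    Returns (scale, R, t). Classical Umeyama alignment, used here only
    as an illustration (assumption: not the method of the paper).
    """
    mu_src = src.mean(axis=0)
    mu_dst = dst.mean(axis=0)
    src_c = src - mu_src
    dst_c = dst - mu_dst

    # Cross-covariance between the centered point sets (3 x 3).
    cov = dst_c.T @ src_c / src.shape[0]
    U, D, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so that R is a proper rotation.
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0
    R = U @ S @ Vt

    var_src = (src_c ** 2).sum() / src.shape[0]
    scale = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - scale * R @ mu_src
    return scale, R, t


# Synthetic check: hypothetical canonical keypoints and a made-up pose.
rng = np.random.default_rng(0)
canon_kpts = rng.uniform(-0.5, 0.5, size=(16, 3))
angle = np.deg2rad(30.0)
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
s_true = 0.3
t_true = np.array([0.1, -0.2, 0.8])

# Keypoints as they would appear in the camera frame under that pose.
cam_kpts = s_true * canon_kpts @ R_true.T + t_true

s_est, R_est, t_est = umeyama_similarity(canon_kpts, cam_kpts)
print(s_est, t_est)
```

With noise-free correspondences the alignment recovers the synthetic pose up to floating-point error. Noisy or mismatched correspondences degrade the estimate, which is why the quality of the correspondences matters for unseen instances.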

Paper Structure

This paper contains 17 sections, 12 equations, 5 figures, 7 tables.

Figures (5)

  • Figure 1: a) Visualization of the correspondence error map and final pose estimation of the dense correspondence-based method DPDN [lin2022dpdn]. Green/red indicate small/large errors and GT/predicted bounding boxes, respectively. b) Points belonging to different parts of the same instance may exhibit similar visual features. Thus, the local geometric information is essential to distinguish them from each other. c) Points belonging to different instances may exhibit similar local geometric structures. Therefore, the global geometric information is crucial for correctly mapping them to the corresponding NOCS coordinates.
  • Figure 2: a) Overview of the proposed AG-Pose. b) Illustration of the IAKD module. We initialize a set of category-shared learnable queries and convert them into instance-adaptive detectors by integrating the object features. The instance-adaptive detectors are then used to detect keypoints for the object. To guide the learning of the IAKD module, we further design the losses $L_{div}$ and $L_{ocd}$ to constrain the distribution of keypoints. c) Illustration of the GAFA module. GAFA efficiently integrates the geometric information into keypoint features through a two-stage feature aggregation process.
  • Figure 3: Illustration of the outlier filter process.
  • Figure 4: Comparisons of NOCS error distributions.
  • Figure 5: Qualitative comparisons between our method and DPDN [lin2022dpdn] on the REAL275 dataset. We visualize the correspondence error maps and pose estimation results of our AG-Pose and DPDN. Red/green indicate large/small errors and predicted/GT bounding boxes, respectively.
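
The IAKD flow described in the Figure 2 caption (category-shared queries, made instance-adaptive with object features, then used to detect keypoints) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the single cross-attention step, the softmax heatmap, the weighted-average keypoint readout, and the names `softmax` and `detect_keypoints` are all hypothetical, and random arrays stand in for learned queries and extracted features. The losses $L_{div}$ and $L_{ocd}$ are not sketched, since their definitions are not given here.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def detect_keypoints(queries, feats, points):
    """Hypothetical instance-adaptive keypoint detection.

    queries: (K, C) category-shared queries (learned in practice).
    feats:   (N, C) per-point object features.
    points:  (N, 3) observed object points.
    Returns keypoints (K, 3) and the heatmap (K, N).
    """
    c = queries.shape[1]

    # Assumed step 1: one cross-attention pass injects object features
    # into the shared queries, giving instance-adaptive detectors.
    attn = softmax(queries @ feats.T / np.sqrt(c), axis=1)   # (K, N)
    detectors = queries + attn @ feats                        # (K, C)

    # Assumed step 2: each detector scores every point; the softmax
    # turns the scores into a heatmap over the N points.
    heatmap = softmax(detectors @ feats.T / np.sqrt(c), axis=1)

    # Assumed step 3: a keypoint is the heatmap-weighted average of
    # the observed points, so it stays inside their convex hull.
    keypoints = heatmap @ points                              # (K, 3)
    return keypoints, heatmap


# Random stand-ins for learned queries and extracted features.
rng = np.random.default_rng(0)
K, N, C = 8, 64, 16
queries = rng.normal(size=(K, C))
feats = rng.normal(size=(N, C))
points = rng.uniform(-0.5, 0.5, size=(N, 3))

keypoints, heatmap = detect_keypoints(queries, feats, points)
print(keypoints.shape, heatmap.shape)
```

Because the detectors depend on `feats`, the same shared queries yield different keypoints for different instances, which is the property the caption refers to as instance-adaptive.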