Perception-Control Coupled Visual Servoing for Textureless Objects Using Keypoint-Based EKF

Allen Tao, Jun Yang, Stanko Oparnica, Wenjie Xue

TL;DR

Building on learning-based keypoint detection for textureless objects, this work improves the robustness of visual servoing by tightly coupling perception and control in a closed loop. An Extended Kalman Filter integrates per-frame keypoint measurements to estimate the 6D object pose.

Abstract

Visual servoing is fundamental to robotic applications, enabling precise positioning and control. However, applying it to textureless objects remains a challenge due to the absence of reliable visual features. Moreover, adverse visual conditions, such as occlusions, often corrupt visual feedback, leading to reduced accuracy and instability in visual servoing. In this work, we build upon learning-based keypoint detection for textureless objects and propose a method that enhances robustness by tightly integrating perception and control in a closed loop. Specifically, we employ an Extended Kalman Filter (EKF) that integrates per-frame keypoint measurements to estimate 6D object pose, which drives pose-based visual servoing (PBVS) for control. The resulting camera motion, in turn, enhances the tracking of subsequent keypoints, effectively closing the perception-control loop. Additionally, unlike standard PBVS, we propose a probabilistic control law that computes both camera velocity and its associated uncertainty, enabling uncertainty-aware control for safe and reliable operation. We validate our approach on real-world robotic platforms using quantitative metrics and grasping experiments, demonstrating that our method outperforms traditional visual servoing techniques in both accuracy and practical application.
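To make the two ingredients named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the 6D pose error is represented as a plain 6-vector (translation plus axis-angle rotation), uses a generic linearized measurement model in place of the keypoint projection model, and stands in a simple proportional law `v = -gain * e` with first-order covariance propagation for the paper's probabilistic control law. The function names `ekf_update` and `proportional_velocity` and all parameters are hypothetical.

```python
import numpy as np


def ekf_update(x, P, z, h, H, R):
    """One EKF measurement update (generic form, assumed for illustration).

    x: (n,) prior state mean        P: (n, n) prior covariance
    z: (m,) measurement             h: (m,) predicted measurement h(x)
    H: (m, n) Jacobian of h at x    R: (m, m) measurement noise covariance
    """
    S = H @ P @ H.T + R                      # innovation covariance
    K = np.linalg.solve(S, H @ P).T          # Kalman gain K = P H^T S^-1
    x_new = x + K @ (z - h)                  # corrected mean
    I_KH = np.eye(len(x)) - K @ H
    P_new = I_KH @ P @ I_KH.T + K @ R @ K.T  # Joseph form keeps P symmetric
    return x_new, P_new


def proportional_velocity(e, Sigma_e, gain):
    """Proportional velocity command with first-order uncertainty.

    Assumes v = -gain * e, so Cov(v) = gain^2 * Cov(e). This is a
    simplification, not the control law derived in the paper.
    """
    v = -gain * e
    Sigma_v = gain ** 2 * Sigma_e
    return v, Sigma_v


if __name__ == "__main__":
    # Toy numbers: identity measurement model on a 6D pose-error state.
    x0, P0 = np.zeros(6), np.eye(6)
    z = 2.0 * np.ones(6)
    x1, P1 = ekf_update(x0, P0, z, h=x0, H=np.eye(6), R=np.eye(6))
    v, Sigma_v = proportional_velocity(x1, P1, gain=0.5)
    print("velocity:", v)
    print("velocity variance:", np.diag(Sigma_v))
```

In a closed-loop setting like the one described, the commanded velocity would also feed the filter's prediction step for the next frame. That step is omitted here because the abstract does not specify the motion model.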

Paper Structure (20 sections, 27 equations, 8 figures, 2 tables)

This paper contains 20 sections, 27 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: Perception–control coupled visual servoing framework. An EKF integrates complementary information from keypoints and the motion prior, producing reliable 6D object poses for visual servoing.
  • Figure 2: Our framework operates in a closed-loop cycle: during the perception stage, the EKF fuses keypoints and the motion prior to estimate 6D poses; during the control stage, a probabilistic control law computes camera velocity, which actuates the robot and serves as the motion prior for the next perception stage.
  • Figure 3: Example of 3D object points $\mathbf{X}_o$, along with 2D image keypoints $\mathbf{u}_i$ and their associated uncertainties $\boldsymbol{\Sigma}_{\mathbf{u}_i}$, as estimated using PVNet [peng2019pvnet].
  • Figure 4: Experimental setup. Top: Hardware platform for visual servoing. Bottom: Target objects for evaluation.
  • Figure 5: Baseline approaches used in our evaluation. (a) IBVS+PVNet. (b) PBVS+PVNet. (c) Experiments with different backgrounds, lighting conditions, and occlusions.
  • ...and 3 more figures