CNSv2: Probabilistic Correspondence Encoded Neural Image Servo

Anzhe Chen; Hongxiang Yu; Shuxin Li; Yuxi Chen; Zhongxiang Zhou; Wentao Sun; Rong Xiong; Yue Wang

TL;DR

CNSv2 tackles the robustness gaps in visual servo caused by unreliable keypoint matching by introducing a neural policy conditioned on probabilistic, multimodal correspondence. It pairs a translation-equivariant probabilistic matching representation, derived from foundation-model features, with a Transformer-based controller, and uses velocity denormalization to decouple training from real-world intrinsics and scene scales. Key contributions include resolution-agnostic anchoring of matching scores, a velocity denormalization strategy, and a hybrid control scheme that blends image-space and Cartesian trajectories, all demonstrated in simulation and in real-world experiments with textureless and illumination-variant scenes. The approach runs in real time and generalizes better to unseen scenes, offering a practical path toward robust visual servo in challenging environments.

Abstract

Visual servo based on traditional image matching methods often requires accurate keypoint correspondence for high-precision control. However, keypoint detection or matching tends to fail in challenging scenarios with inconsistent illumination or textureless objects, resulting in significant performance degradation. Previous approaches, including our previously proposed Correspondence encoded Neural image Servo policy (CNS), attempted to alleviate these issues by integrating neural control strategies. While CNS is more tolerant of erroneous correspondences than conventional image-based controllers, it could not fully resolve the limitations arising from poor keypoint detection and matching. In this paper, we continue to address this problem and propose a new solution: Probabilistic Correspondence Encoded Neural Image Servo (CNSv2). CNSv2 leverages probabilistic feature matching to improve robustness in challenging scenarios. By redesigning the architecture to condition on multimodal feature matching, CNSv2 achieves high precision and improved robustness across diverse scenes, and runs in real time. We validate CNSv2 with simulations and real-world experiments, demonstrating its effectiveness in overcoming the limitations of detector-based methods in visual servo tasks.
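
To make the idea of probabilistic, multimodal matching concrete, the sketch below turns descriptor similarities into a probability distribution over candidate locations rather than committing to a single hard match. It is only an illustration under stated assumptions, not the CNSv2 implementation: the function name `matching_distribution`, the use of cosine similarity, the softmax temperature of 0.1, and the toy 2-D descriptors are all hypothetical choices made for this example.

```python
import math


def matching_distribution(query, candidates, temperature=0.1):
    """Return a softmax distribution over candidates for one query descriptor.

    Assumption: similarity is cosine similarity and the temperature is a
    free parameter; neither is taken from the paper.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b + 1e-12)

    logits = [cosine(query, c) / temperature for c in candidates]
    # Subtract the maximum logit for numerical stability before exponentiating.
    max_logit = max(logits)
    exps = [math.exp(l - max_logit) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: two candidates look almost identical to the query,
# as can happen on textureless surfaces; the third is clearly different.
query = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
probs = matching_distribution(query, candidates)
```

In this toy case the first two candidates share most of the probability mass, so the ambiguity is preserved for a downstream controller instead of being collapsed into one possibly wrong keypoint match.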
