EasyHeC++: Fully Automatic Hand-Eye Calibration with Pretrained Image Models

Zhengdong Hong, Kangfu Zheng, Linghao Chen

TL;DR

EasyHeC++ is a framework for fully automatic hand-eye calibration. It is the first system that accurately calibrates any robot arm in a marker-free, training-free, and fully automatic manner.

Abstract

Hand-eye calibration plays a fundamental role in robotics by directly influencing the efficiency of critical operations such as manipulation and grasping. In this work, we present a novel framework, EasyHeC++, designed for fully automatic hand-eye calibration. In contrast to previous methods that necessitate manual calibration, specialized markers, or the training of arm-specific neural networks, our approach is the first system that enables accurate calibration of any robot arm in a marker-free, training-free, and fully automatic manner. Our approach employs a two-step process. First, we initialize the camera pose using a sampling or feature-matching-based method with the aid of pretrained image models. Subsequently, we perform pose optimization through differentiable rendering. Extensive experiments demonstrate the system's superior accuracy in both synthetic and real-world datasets across various robot arms and camera settings. Project page: https://ootts.github.io/easyhec_plus.

Paper Structure

This paper contains 22 sections, 3 equations, 4 figures, 10 tables.

Figures (4)

  • Figure 1: Comparison between our method and previous methods. Our method is not only highly accurate but also fully automatic, marker-free, and training-free.
  • Figure 2: EasyHeC++ architecture. We consider not only single-instance calibration but also recalibration after the camera moves. Each calibration consists of two main components: pose initialization and pose optimization. For the first calibration, a sampling-based pose initialization module estimates a rough camera pose $T_{cb}^{init}$; for subsequent recalibrations, a feature matching (FM)-based module initializes the camera pose from the historical image-pose pairs in the database. Pose optimization first uses a differentiable renderer (DR) to optimize the camera pose, then runs space exploration (SE) to select the next joint pose and improve accuracy. During this process, AutoSAM predicts the mask that supervises the DR step. After solving for the camera pose $T_{cb}$, we add the image-pose pair to the database.
  • Figure 3: Example images for our method under the eye-in-hand setting. (a) is the image captured by the in-hand camera and (b) is the image captured from the spectator's view for illustration.
  • Figure 4: Ablation study on different types of prompts to the SAM model. (a) A single bounding box of the whole robot arm as the prompt. (b) A single bounding box of the robot arm and its center point as the prompt. (c) Per-link bounding boxes as the prompt. (d) Per-link bounding boxes and their center points as the prompt. (e) Bounding boxes of each link and each connector as the prompt. (f) Bounding boxes of each link and each connector and their center points as the prompt. For clarity, we do not show all the prompts from (c) to (f).
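
The prompt variants compared in Figure 4 can be illustrated with a minimal sketch. It assumes each robot link is available as a binary mask (for example, rendered under a rough initial camera pose; this is an assumption, not something the captions state) and that a "center point" means the center of the bounding box. The function names `mask_to_box_and_center` and `build_prompts` are hypothetical and are not taken from the EasyHeC++ code; the sketch stops at building the prompts and does not call SAM.

```python
import numpy as np


def mask_to_box_and_center(mask):
    """Return ((x_min, y_min, x_max, y_max), (cx, cy)) for a binary mask.

    Returns None when the mask is empty (e.g. the link is out of view).
    """
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    box = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
    # Assumption: the point prompt is the center of the bounding box.
    center = ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)
    return box, center


def build_prompts(link_masks, per_link=True, with_points=False):
    """Build box (and optional point) prompts from a non-empty list of link masks.

    per_link=False, with_points=False  -> variant (a): one box for the whole arm
    per_link=False, with_points=True   -> variant (b): whole-arm box plus center
    per_link=True,  with_points=False  -> variant (c): one box per link
    per_link=True,  with_points=True   -> variant (d): per-link boxes plus centers
    """
    if per_link:
        masks = list(link_masks)
    else:
        # Merge all links into a single whole-arm mask.
        masks = [np.any(np.stack(link_masks), axis=0)]

    prompts = []
    for mask in masks:
        result = mask_to_box_and_center(mask)
        if result is None:
            continue  # skip links that are not visible
        box, center = result
        prompts.append({"box": box, "point": center if with_points else None})
    return prompts


# Two toy "links" in a 10x10 image.
link_a = np.zeros((10, 10), dtype=bool)
link_a[2:5, 1:4] = True
link_b = np.zeros((10, 10), dtype=bool)
link_b[6:9, 5:9] = True

whole_arm = build_prompts([link_a, link_b], per_link=False, with_points=True)
per_link = build_prompts([link_a, link_b], per_link=True, with_points=True)
```

Variants (e) and (f) would follow the same pattern if connector masks were appended to the list of link masks.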