RPGD: RANSAC-P3P Gradient Descent for Extrinsic Calibration in 3D Human Pose Estimation

Zhanyu Tuo

TL;DR

Experimental results demonstrate that RPGD consistently recovers extrinsic parameters with accuracy comparable to the provided ground truth, achieving sub-pixel MPJPE reprojection error even in challenging, noisy settings.

Abstract

We propose RPGD (RANSAC-P3P Gradient Descent), a human-pose-driven extrinsic calibration framework that robustly aligns MoCap-based 3D skeletal data with monocular or multi-view RGB cameras using only natural human motion. RPGD formulates extrinsic calibration as a coarse-to-fine problem tailored to human poses, combining the global robustness of RANSAC-P3P with gradient-descent-based refinement. We evaluate RPGD on three large-scale public 3D human pose estimation (HPE) datasets as well as on a self-collected in-the-wild dataset. Experimental results demonstrate that RPGD consistently recovers extrinsic parameters with accuracy comparable to the provided ground truth, achieving sub-pixel MPJPE reprojection error even in challenging, noisy settings. These results indicate that RPGD provides a practical and automatic solution for reliable extrinsic calibration in large-scale 3D HPE dataset collection.

Paper Structure (26 sections, 7 equations, 3 figures, 3 tables, 1 algorithm)

Figures (3)

  • Figure 1: Overview of the RPGD extrinsic calibration framework. The pipeline consists of two stages. First, a RANSAC-P3P solver estimates the initial rotation $\mathbf{R}_m$ and translation $\mathbf{t}_m$ by maximizing the number of inliers. Second, this initial estimate is refined by gradient descent to minimize the 2D MPJPE reprojection error.
  • Figure 2: Average 2D MPJPE (in pixels) of each sequence across three public 3D HPE datasets. Blue circles indicate the average GT MPJPE, orange plus signs the average MPJPE after RANSAC-P3P, and green stars the average MPJPE after RPGD. The x-axis is the sequence index and the y-axis is the MPJPE in pixels. (a) MPJPE on each MPI-INF-3DHP sequence. (b) MPJPE on each Human3.6M sequence. (c-1, c-2) MPJPE on each AIST++ sequence.
  • Figure 3: Qualitative evaluation on an in-the-wild scene. Left: Raw 3D skeleton keypoints captured by a Kinect sensor. Middle: Ground-truth 2D keypoint annotations. Right: 2D keypoints projected from the raw 3D skeleton keypoints using the extrinsic matrix estimated by RPGD.
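
The second stage described in the Figure 1 caption (refining an initial pose by gradient descent on the 2D MPJPE) can be illustrated with a minimal sketch. This is not the authors' implementation: the axis-angle parameterization, the finite-difference gradient, the backtracking step size, and all function names (`rodrigues`, `project`, `mpjpe_2d`, `refine_pose`) are assumptions made for illustration, and the initial pose is assumed to come from a separate RANSAC-P3P step that is not shown here.

```python
import math


def rodrigues(rvec):
    """Axis-angle vector -> 3x3 rotation matrix (Rodrigues' formula)."""
    theta = math.sqrt(sum(v * v for v in rvec))
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (v / theta for v in rvec)
    c, s = math.cos(theta), math.sin(theta)
    u = 1.0 - c
    return [
        [c + kx * kx * u, kx * ky * u - kz * s, kx * kz * u + ky * s],
        [ky * kx * u + kz * s, c + ky * ky * u, ky * kz * u - kx * s],
        [kz * kx * u - ky * s, kz * ky * u + kx * s, c + kz * kz * u],
    ]


def project(points3d, pose, fx, fy, cx, cy):
    """Pinhole projection; pose = [rx, ry, rz, tx, ty, tz] (assumed layout)."""
    R, t = rodrigues(pose[:3]), pose[3:]
    pixels = []
    for p in points3d:
        x, y, z = (sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        z = max(z, 1e-6)  # toy guard against points at or behind the camera
        pixels.append((fx * x / z + cx, fy * y / z + cy))
    return pixels


def mpjpe_2d(pose, points3d, points2d, intr):
    """Mean Euclidean distance (pixels) between projected and annotated joints."""
    proj = project(points3d, pose, *intr)
    return sum(math.dist(a, b) for a, b in zip(proj, points2d)) / len(points2d)


def refine_pose(pose0, points3d, points2d, intr, iters=200, step=1e-5, eps=1e-6):
    """Gradient descent on 2D MPJPE with a forward-difference gradient.

    The step is halved until the loss decreases (backtracking) and doubled
    after each accepted update; both are illustrative choices.
    """
    pose = list(pose0)
    loss = mpjpe_2d(pose, points3d, points2d, intr)
    for _ in range(iters):
        grad = []
        for k in range(6):
            bumped = pose.copy()
            bumped[k] += eps
            grad.append((mpjpe_2d(bumped, points3d, points2d, intr) - loss) / eps)
        while step > 1e-12:
            cand = [p - step * g for p, g in zip(pose, grad)]
            cand_loss = mpjpe_2d(cand, points3d, points2d, intr)
            if cand_loss < loss:
                pose, loss = cand, cand_loss
                step *= 2.0
                break
            step *= 0.5
    return pose, loss
```

Calling `refine_pose(init_pose, joints_3d, joints_2d, (fx, fy, cx, cy))` returns the refined six-parameter pose together with its final 2D MPJPE. A single step size is shared between rotation (radians) and translation (scene units) here for brevity; a practical optimizer would likely scale the two parameter groups differently or use an adaptive method.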