RPGD: RANSAC-P3P Gradient Descent for Extrinsic Calibration in 3D Human Pose Estimation

Zhanyu Tuo

TL;DR

Experimental results demonstrate that RPGD consistently recovers extrinsic parameters with accuracy comparable to the provided ground truth, achieving sub-pixel MPJPE reprojection error even in challenging, noisy settings.

Abstract

In this paper, we propose RPGD (RANSAC-P3P Gradient Descent), a human-pose-driven extrinsic calibration framework that robustly aligns MoCap-based 3D skeletal data with monocular or multi-view RGB cameras using only natural human motion. RPGD formulates extrinsic calibration as a coarse-to-fine problem tailored to human poses, combining the global robustness of RANSAC-P3P with gradient-descent-based refinement. We evaluate RPGD on three large-scale public 3D HPE datasets as well as on a self-collected in-the-wild dataset. Experimental results demonstrate that RPGD consistently recovers extrinsic parameters with accuracy comparable to the provided ground truth, achieving sub-pixel MPJPE reprojection error even in challenging, noisy settings. These results indicate that RPGD provides a practical and automatic solution for reliable extrinsic calibration in large-scale 3D HPE dataset collection.
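The refinement half of the coarse-to-fine idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a pinhole camera with known intrinsics, an axis-angle rotation parameterization, a central-difference gradient standing in for an analytic one, and a step-halving rule so the error never increases. The names `rodrigues`, `project`, `mpjpe_2d`, `numeric_grad`, and `refine` are hypothetical, and the coarse estimate is simply passed in rather than computed by RANSAC-P3P.

```python
import math

def rodrigues(w):
    """Axis-angle vector -> 3x3 rotation matrix (Rodrigues' formula)."""
    theta = math.sqrt(sum(a * a for a in w))
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (a / theta for a in w)
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]

def project(params, joints_3d, K):
    """Pinhole projection; params = [rx, ry, rz, tx, ty, tz], K = (fx, fy, cx, cy)."""
    fx, fy, cx, cy = K
    R, t = rodrigues(params[:3]), params[3:]
    pixels = []
    for X in joints_3d:
        # Camera-frame coordinates: Xc = R @ X + t
        xc, yc, zc = (sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3))
        pixels.append((fx * xc / zc + cx, fy * yc / zc + cy))
    return pixels

def mpjpe_2d(params, joints_3d, joints_2d, K):
    """Mean per-joint 2D distance (pixels) between projections and labels."""
    pred = project(params, joints_3d, K)
    return sum(math.dist(p, q) for p, q in zip(pred, joints_2d)) / len(joints_2d)

def numeric_grad(f, x, eps=1e-6):
    """Central-difference gradient (assumption: replaces an analytic gradient)."""
    grad = []
    for i in range(len(x)):
        hi = x[:i] + [x[i] + eps] + x[i + 1:]
        lo = x[:i] + [x[i] - eps] + x[i + 1:]
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad

def refine(coarse, joints_3d, joints_2d, K, iters=200, step=1e-6):
    """Gradient descent from a coarse pose; a step is kept only if the error drops."""
    f = lambda p: mpjpe_2d(p, joints_3d, joints_2d, K)
    params = list(coarse)
    best = f(params)
    for _ in range(iters):
        g = numeric_grad(f, params)
        trial = [p - step * gi for p, gi in zip(params, g)]
        err = f(trial)
        if err < best:
            params, best = trial, err
            step *= 1.2   # accepted: try a slightly larger step next time
        else:
            step *= 0.5   # rejected: shrink the step and retry
    return params, best
```

Calling `refine(coarse, joints_3d, joints_2d, K)` returns the updated six-vector and its 2D MPJPE. Because a step is accepted only when the error decreases, the returned error is never worse than that of the coarse estimate; how close it gets to the optimum depends on the iteration budget and the step schedule, which are illustrative choices here.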

Paper Structure (26 sections, 7 equations, 3 figures, 3 tables, 1 algorithm)

Introduction
Related Work
3D Human Pose Estimation
Extrinsic Matrix Calibration
Method
Problem Formulation
RANSAC-P3P Initialization
Gradient-Descent-based Refinement
Objective Function
Gradient Computation
Optimization Details
Experiments
Datasets
Implementation Details
Evaluation Protocol
...and 11 more sections

Figures (3)

Figure 1: Overview of the RPGD extrinsic calibration framework. The pipeline consists of two stages. First, a RANSAC-P3P solver estimates the initial rotation $\mathbf{R}_m$ and translation $\mathbf{t}_m$ by maximizing the number of inliers. Second, this initial estimate is refined by gradient descent to minimize the 2D MPJPE reprojection error.
Figure 2: Average 2D MPJPE (in pixels) of each sequence across three public 3D HPE datasets. Blue circles indicate the average GT MPJPE, orange plus signs the average MPJPE after RANSAC-P3P, and green stars the average MPJPE after RPGD. The x-axis is the sequence index and the y-axis is the MPJPE in pixels. (a) MPJPE on each MPI-INF-3DHP sequence. (b) MPJPE on each Human3.6M sequence. (c-1, c-2) MPJPE on each AIST++ sequence.
Figure 3: Qualitative evaluation on an in-the-wild scene. Left: raw 3D skeleton keypoints captured by a Kinect sensor. Middle: ground-truth 2D keypoint annotations. Right: 2D keypoints obtained by projecting the raw 3D skeleton keypoints with the extrinsic matrix estimated by RPGD.
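The inlier-maximization criterion mentioned in the Figure 1 caption can be sketched as follows. This is an illustrative consensus step only: the P3P minimal solver that would generate pose hypotheses is omitted, and the pinhole intrinsics tuple `K`, the 5-pixel threshold, and the function names `project_point`, `count_inliers`, and `best_hypothesis` are assumptions rather than details taken from the paper.

```python
import math

def project_point(R, t, K, X):
    """Project one 3D joint X with rotation R (3x3), translation t, K = (fx, fy, cx, cy)."""
    fx, fy, cx, cy = K
    xc, yc, zc = (sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3))
    return fx * xc / zc + cx, fy * yc / zc + cy

def count_inliers(R, t, K, joints_3d, joints_2d, thresh_px=5.0):
    """Number of joints whose reprojection lies within thresh_px of its 2D label."""
    return sum(
        math.dist(project_point(R, t, K, X), uv) <= thresh_px
        for X, uv in zip(joints_3d, joints_2d)
    )

def best_hypothesis(hypotheses, K, joints_3d, joints_2d, thresh_px=5.0):
    """Return the (R, t) candidate with the largest consensus set."""
    return max(
        hypotheses,
        key=lambda h: count_inliers(h[0], h[1], K, joints_3d, joints_2d, thresh_px),
    )
```

In a full RANSAC loop, each hypothesis would come from a P3P solve on a random minimal subset of 2D-3D joint correspondences, and the winner of `best_hypothesis` would serve as the starting point for refinement.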
