MirrorCalib: Utilizing Human Pose Information for Mirror-based Virtual Camera Calibration

Longyun Liao, Rong Zheng, Andrew Mitchell

TL;DR

MirrorCalib tackles the challenging problem of calibrating a virtual camera relative to a real camera in mirror-involved exercise videos where views have little overlap. It combines a modified eight-point algorithm under mirror constraints, a decomposition of the reflective essential matrix, and a body-prior-driven refinement of 2D joints with RANSAC-based outlier rejection to estimate the extrinsic parameters. The approach yields accurate extrinsics on real data (rotation ~$1.82$–$2.57^\circ$, translation ~$69$–$91$ mm) and improves downstream 3D pose estimation (PA-MPJPE ~ $68.5$ mm) compared to baselines. This work enables robust triangulation-based 3D reconstruction in coaching and rehabilitation contexts where mirror views are common and feature correspondences are scarce.

Abstract

In this paper, we present the novel task of estimating the extrinsic parameters of a virtual camera relative to a real camera in exercise videos with a mirror. This task poses a significant challenge in scenarios where the views from the real and mirrored cameras have no overlap or share no salient features. To address this issue, prior knowledge of the human body and 2D joint locations are utilized to estimate the camera extrinsic parameters when a person is in front of a mirror. We devise a modified eight-point algorithm to obtain an initial estimate from 2D joint locations. The 2D joint locations are then refined subject to human body constraints. Finally, a RANSAC algorithm is employed to remove outliers by comparing their epipolar distances to a predetermined threshold. MirrorCalib achieves a rotation error of 1.82° and a translation error of 69.51 mm on a collected real-world dataset, outperforming the state-of-the-art method.
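
To make the eight-point and RANSAC steps concrete, the sketch below implements the classical normalized eight-point algorithm and a generic RANSAC loop that rejects correspondences whose epipolar distance exceeds a threshold. It is not the modified, mirror-constrained algorithm of the paper, and it omits the body-prior refinement. The function names, the `threshold` value, and the iteration count are assumptions chosen for illustration; `x1` and `x2` stand for matched 2D joint locations of the real and mirrored person.

```python
import numpy as np

def normalize_points(pts):
    """Hartley normalization: zero centroid, mean distance sqrt(2)."""
    centroid = pts.mean(axis=0)
    scale = np.sqrt(2) / np.linalg.norm(pts - centroid, axis=1).mean()
    T = np.array([[scale, 0.0, -scale * centroid[0]],
                  [0.0, scale, -scale * centroid[1]],
                  [0.0, 0.0, 1.0]])
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    return (T @ pts_h.T).T, T

def eight_point(x1, x2):
    """Classical eight-point estimate of F such that x2^T F x1 = 0."""
    p1, T1 = normalize_points(x1)
    p2, T2 = normalize_points(x2)
    # Row n holds p2[n, i] * p1[n, j], matching the row-major layout of F.
    A = np.einsum('ni,nj->nij', p2, p1).reshape(len(p1), 9)
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce the rank-2 constraint by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    # Undo the normalization.
    return T2.T @ F @ T1

def epipolar_distances(F, x1, x2):
    """Distance from each point in x2 to the epipolar line F @ x1."""
    h1 = np.column_stack([x1, np.ones(len(x1))])
    h2 = np.column_stack([x2, np.ones(len(x2))])
    lines = h1 @ F.T  # row n is F @ h1[n]
    return np.abs(np.sum(lines * h2, axis=1)) / np.linalg.norm(lines[:, :2], axis=1)

def ransac_eight_point(x1, x2, threshold, iters=200, seed=0):
    """Generic RANSAC: keep the sample whose model has the most inliers."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(x1), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(x1), size=8, replace=False)
        F = eight_point(x1[idx], x2[idx])
        inliers = epipolar_distances(F, x1, x2) < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 8:
        raise ValueError("fewer than 8 inliers; cannot refit the model")
    # Refit on all inliers of the best sample.
    return eight_point(x1[best_inliers], x2[best_inliers]), best_inliers
```

If the inputs are in normalized camera coordinates (pixel coordinates multiplied by the inverse intrinsic matrix), the returned matrix plays the role of an essential matrix, and `threshold` is expressed in those same normalized units rather than in pixels.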

Paper Structure (20 sections, 16 equations, 3 figures, 2 tables)

Figures (3)

  • Figure 1: Overview of MirrorCalib. The process takes a video of a human in front of a mirror and passes it through a 2D human pose estimator to obtain the 2D joint locations for both the real and mirrored human. Using a modified eight-point algorithm, an initial estimation of the virtual camera pose is obtained. This estimation is then refined through an optimization process, and a RANSAC algorithm is used to choose the best estimation among all available results.
  • Figure 2: The coordinate frames of the real and virtual cameras and their relationship. $\mathbf{R}$, $\mathbf{t}$, and $\mathbf{D}$ are the rotation, translation, and reflection transformations, respectively.
  • Figure 3: Real dataset collection setup. The chessboard can be captured in both real and virtual views. The corners on the chessboard are only for ground-truth derivation, and are not the key points for MirrorCalib.
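
The caption of Figure 2 names a reflection transformation but does not give its form. As a minimal sketch, assuming the mirror is an ideal plane with unit normal `n` at distance `d` from the real camera origin (both hypothetical names, not taken from the paper), a reflection can be written in Householder form. The paper's own definition of $\mathbf{D}$ may differ, for example by a fixed axis flip.

```python
import numpy as np

def reflection_matrix(normal, distance):
    """4x4 homogeneous reflection across the plane n . x = distance.

    Assumed Householder parameterization: x' = (I - 2 n n^T) x + 2 d n.
    """
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)  # make the normal unit length
    S = np.eye(4)
    S[:3, :3] = np.eye(3) - 2.0 * np.outer(n, n)
    S[:3, 3] = 2.0 * distance * n
    return S

def reflect_points(S, pts):
    """Apply the reflection S to an (N, 3) array of 3D points."""
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    return (S @ pts_h.T).T[:, :3]
```

With a mirror plane `z = 2` in the real camera frame, `reflect_points(reflection_matrix([0, 0, 1], 2.0), np.zeros((1, 3)))` places the virtual camera centre at `(0, 0, 4)`. The linear part of such a reflection has determinant -1, so the virtual camera frame is left-handed; this handedness change is why an ordinary rotation and translation alone cannot relate the two views.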