Mirror-Aware Neural Humans
Daniel Ajisafe, James Tang, Shih-Yang Su, Bastian Wandt, Helge Rhodin
TL;DR
This work tackles the challenge of recovering detailed 3D human pose, shape, and appearance from monocular video by leveraging a mirror as a second synchronized view. It introduces Mirror-Aware Neural Humans, a three-stage pipeline that automatically calibrates the camera and mirror, lifts 2D keypoints to a 3D skeleton, and refines a neural radiance field with a layered, occlusion-aware mirror representation. The key contributions are a robust mirror-calibration method, a bone-oriented skeleton representation, and a Layered A-NeRF framework that handles mirror occlusions, yielding improved pose accuracy and sharper appearance over prior mirror-free and single-view methods. The approach enables consumer-level, camera-and-mirror-based 3D motion capture with practical implications for low-cost rehabilitation and related applications.
Abstract
Human motion capture either requires multi-camera systems or is unreliable when using single-view input due to depth ambiguities. Meanwhile, mirrors are readily available in urban environments and form an affordable alternative by recording two views with only a single camera. However, the mirror setting poses the additional challenge of handling occlusions between the real person and the mirror image. Going beyond existing mirror approaches for 3D human pose estimation, we utilize mirrors for learning a complete body model, including shape and dense appearance. Our main contributions are extending articulated neural radiance fields to include a notion of a mirror, making them sample-efficient over potential occlusion regions. Together, our contributions realize a consumer-level 3D motion capture system that starts from off-the-shelf 2D poses by automatically calibrating the camera, estimating mirror orientation, and subsequently lifting 2D keypoint detections to a 3D skeleton pose that is used to condition the mirror-aware NeRF. We empirically demonstrate the benefit of learning a body model and accounting for occlusion in challenging mirror scenes.
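
To make the "second view from a mirror" idea concrete, the following is a minimal sketch of the standard planar reflection that relates a real 3D point to its mirror image. It is textbook geometry, not the authors' implementation: the function names `mirror_matrix` and `reflect_points`, and the plane parameterization (unit normal `n` and offset `d` with the mirror plane defined as all points `x` where `n · x = d`), are assumptions chosen for illustration.

```python
import numpy as np


def mirror_matrix(normal, offset):
    """Return the 4x4 homogeneous reflection across the plane n . x = offset.

    Hypothetical helper for illustration; `normal` is normalized internally,
    and `offset` is measured along the normalized normal.
    """
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    A = np.eye(4)
    # Linear part: Householder reflection I - 2 n n^T.
    A[:3, :3] = np.eye(3) - 2.0 * np.outer(n, n)
    # Translation part accounts for a plane that does not pass through the origin.
    A[:3, 3] = 2.0 * offset * n
    return A


def reflect_points(points, normal, offset):
    """Reflect an (N, 3) array of 3D points across the mirror plane."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    return (mirror_matrix(normal, offset) @ homogeneous.T).T[:, :3]


if __name__ == "__main__":
    # Two made-up joint positions in front of a mirror lying in the plane z = 3.
    joints = np.array([[0.0, 0.0, 1.0], [1.0, 2.0, 0.5]])
    print(reflect_points(joints, normal=[0.0, 0.0, 1.0], offset=3.0))
```

Under these assumptions, the reflected points behave like the same skeleton observed by a virtual camera behind the mirror, which is why a single camera plus a mirror can provide two-view constraints. Applying the reflection twice returns the original points.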
