Splannequin: Freezing Monocular Mannequin-Challenge Footage with Dual-Detection Splatting

Hao-Jen Chien; Yi-Chuan Huang; Chung-Ho Wu; Wei-Lun Chao; Yu-Lun Liu

TL;DR

Splannequin synthesizes frozen 3D scenes from monocular Mannequin-Challenge footage by suppressing the artifacts that ill-supervised Gaussian primitives cause in dynamic Gaussian splatting. It introduces a dual-state regularization that classifies Gaussians as hidden or defective and anchors them to reliable past or future states, within an architecture-agnostic framework that adds zero inference overhead. The approach substantially improves perceptual quality and enables real-time, user-controlled freeze-time renderings, demonstrated on real and synthetic data with strong user preference results. This enables high-fidelity freeze-time visualizations for consumer videos and VR/AR experiences, as well as data augmentation for dynamic scene understanding, while maintaining efficiency.

Abstract

Synthesizing high-fidelity frozen 3D scenes from monocular Mannequin-Challenge (MC) videos is a unique problem distinct from standard dynamic scene reconstruction. Instead of focusing on modeling motion, our goal is to create a frozen scene while strategically preserving subtle dynamics to enable user-controlled instant selection. To achieve this, we introduce a novel application of dynamic Gaussian splatting: the scene is modeled dynamically, which retains nearby temporal variation, and a static scene is rendered by fixing the model's time parameter. However, under this usage, monocular capture with sparse temporal supervision introduces artifacts like ghosting and blur for Gaussians that become unobserved or occluded at weakly supervised timestamps. We propose Splannequin, an architecture-agnostic regularization that detects two states of Gaussian primitives, hidden and defective, and applies temporal anchoring. Under predominantly forward camera motion, hidden states are anchored to their recent well-observed past states, while defective states are anchored to future states with stronger supervision. Our method integrates into existing dynamic Gaussian pipelines via simple loss terms, requires no architectural changes, and adds zero inference overhead. This results in markedly improved visual quality, enabling high-fidelity, user-selectable frozen-time renderings, validated by a 96% user preference. Project page: https://chien90190.github.io/splannequin/
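
The temporal anchoring idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no formulas, so the detection rules here are assumptions. A Gaussian is treated as hidden at a timestamp when it is unobserved there, and as defective when it is observed but its supervision score falls below a threshold. The names `temporal_anchor_loss`, `observed`, `score`, and `tau` are hypothetical, and the squared-distance penalty is a stand-in for whatever loss terms the method actually uses.

```python
import numpy as np

def temporal_anchor_loss(states, observed, score, tau=0.5):
    """Illustrative anchoring penalty (assumed form, not the paper's loss).

    states:   (T, N, D) per-timestamp parameters of N Gaussians.
    observed: (T, N) bool, True if the Gaussian is seen at that timestamp.
    score:    (T, N) float, hypothetical measure of supervision strength.
    tau:      hypothetical threshold separating weak from strong supervision.
    """
    T, N, _ = states.shape
    total, count = 0.0, 0
    for n in range(N):
        # Timestamps at which this Gaussian is observed and well supervised.
        reliable = observed[:, n] & (score[:, n] >= tau)
        for t in range(T):
            if not observed[t, n]:
                # Hidden state: anchor to the most recent reliable past state.
                past = np.flatnonzero(reliable[:t])
                if past.size == 0:
                    continue
                ref = past[-1]
            elif score[t, n] < tau:
                # Defective state: anchor to the nearest reliable future state.
                future = np.flatnonzero(reliable[t + 1:])
                if future.size == 0:
                    continue
                ref = t + 1 + future[0]
            else:
                continue  # well supervised: no regularization applied
            # In a real training loop the anchor states[ref, n] would be
            # treated as a constant (no gradient flows into it).
            total += float(np.sum((states[t, n] - states[ref, n]) ** 2))
            count += 1
    return total / max(count, 1)

# Toy example: one Gaussian, three timestamps, hidden at t = 1.
states = np.array([[[0.0]], [[1.0]], [[3.0]]])
observed = np.array([[True], [False], [True]])
score = np.array([[1.0], [0.0], [1.0]])
hidden_loss = temporal_anchor_loss(states, observed, score)
```

Because a term of this kind only enters the training objective, it is consistent with the claim that rendering at a fixed time parameter incurs no extra inference cost.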
