S-DyRF: Reference-Based Stylized Radiance Fields for Dynamic Scenes
Xingyi Li, Zhiguo Cao, Yizheng Wu, Kewei Wang, Ke Xian, Zhe Wang, Guosheng Lin
TL;DR
S-DyRF stylizes dynamic 3D scenes from limited stylized references by introducing temporal pseudo-references and a two-stage spatio-temporal transfer on dynamic neural radiance fields. The method builds on a pre-trained dynamic field $F_{\theta}$, renders a stylized reference $\nabla S_R^k$, and uses temporal pseudo-references to propagate style across time. A coarse feature-level transfer is followed by a fine, temporally aware refinement via Temporal Reference Ray Registration. The optimization combines coarse and fine stylization losses with a temporal total-variation regularizer and operates on a 4D scene representation (space and time), producing stylized novel views and times that remain semantically aligned with the reference. Experiments on synthetic and real data show improved perceptual similarity and temporal and spatial consistency over the baselines (ARF*, Ref-NPR*, Texler) and indicate a strong user preference, underscoring practical relevance for controllable 3D art and design. Overall, S-DyRF enables temporally coherent stylization of dynamic 3D scenes with minimal reference input.
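As a rough illustration of how an objective of this shape might be assembled, the sketch below combines a coarse loss, a fine loss, and a temporal total-variation term into one scalar. The function names, the weights, and the choice of a mean absolute difference between consecutive time slices are assumptions made for illustration; the summary above does not specify the exact form used by the method.

```python
def temporal_tv(frames):
    """Mean absolute difference between consecutive time slices.

    `frames` is a list of equal-length lists of floats, one per time step.
    It is a hypothetical stand-in for the time axis of a 4D representation.
    """
    if len(frames) < 2:
        return 0.0
    total, count = 0.0, 0
    for prev, curr in zip(frames, frames[1:]):
        for a, b in zip(prev, curr):
            total += abs(b - a)
            count += 1
    return total / count if count else 0.0


def total_loss(coarse, fine, frames, w_coarse=1.0, w_fine=1.0, w_tv=0.1):
    """Weighted sum of the three terms; the weights are illustrative only."""
    return w_coarse * coarse + w_fine * fine + w_tv * temporal_tv(frames)
```

A larger `w_tv` penalizes abrupt changes between neighboring time steps more strongly, trading style fidelity for temporal smoothness.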
Abstract
Current 3D stylization methods often assume static scenes, which violates the dynamic nature of our real world. To address this limitation, we present S-DyRF, a reference-based spatio-temporal stylization method for dynamic neural radiance fields. However, stylizing dynamic 3D scenes is inherently challenging due to the limited availability of stylized reference images along the temporal axis. Our key insight lies in introducing additional temporal cues besides the provided reference. To this end, we generate temporal pseudo-references from the given stylized reference. These pseudo-references facilitate the propagation of style information from the reference to the entire dynamic 3D scene. For coarse style transfer, we enforce novel views and times to mimic the style details present in pseudo-references at the feature level. To preserve high-frequency details, we create a collection of stylized temporal pseudo-rays from temporal pseudo-references. These pseudo-rays serve as detailed and explicit stylization guidance for achieving fine style transfer. Experiments on both synthetic and real-world datasets demonstrate that our method yields plausible stylized results of space-time view synthesis on dynamic 3D scenes.
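To make the idea of feature-level guidance concrete, here is a minimal sketch of one way a rendered view could be pushed toward the features of a pseudo-reference. It matches each rendered feature vector to its nearest reference feature under cosine distance, in the spirit of the nearest-neighbor feature matching used by ARF. This is an assumed stand-in: the abstract does not state which feature loss S-DyRF uses, and the names `cosine_distance` and `nn_feature_loss` are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def nn_feature_loss(rendered_feats, reference_feats):
    """Average distance from each rendered feature to its nearest reference feature.

    Both arguments are lists of feature vectors (lists of floats), e.g. features
    extracted from a rendered view and from a pseudo-reference (assumed inputs).
    """
    if not rendered_feats or not reference_feats:
        raise ValueError("both feature sets must be non-empty")
    total = 0.0
    for f in rendered_feats:
        total += min(cosine_distance(f, r) for r in reference_feats)
    return total / len(rendered_feats)
```

In practice the features would come from a pre-trained image encoder and the loss would be computed with a differentiable array library; plain lists are used here only to keep the sketch self-contained.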
