Multi-level Dynamic Style Transfer for NeRFs
Zesheng Li, Shuaibo Li, Wei Ma, Jianwei Guo, Hongbin Zha
TL;DR
The paper addresses the challenge of transferring artistic styles to 3D scenes represented by NeRFs while preserving multi-scale spatial structure. It introduces MDS-NeRF, a zero-shot framework built on a redesigned NeRF pipeline with three components: a multi-level feature adaptor (MLFA) that produces a multi-level feature grid from the content radiance field, a dynamic style injection (DSI) module, and a multi-level cascade decoder (MLCD) that renders the stylized features into the final view. Training occurs in two stages: Stage 1 reconstructs the multi-level feature grid; Stage 2 performs stylization by learning LIN and DSI, with the overall loss $L_g = L_f + L_r$. The method supports 2D and 3D style references, enables 3D-to-3D omni-view stylization and style mixing, and is reported to outperform prior work in both content preservation and stylization quality. Its limitations are that results depend on the quality of the 3D style reference and that shape-changing tasks are not addressed. These results suggest that zero-shot, view-consistent 3D style transfer is feasible for NeRF-based scenes, with potential applications in AR/VR and design workflows.
Abstract
As the application of neural radiance fields (NeRFs) in various 3D vision tasks continues to expand, numerous NeRF-based style transfer techniques have been developed. However, existing methods typically integrate style statistics into the original NeRF pipeline, often leading to suboptimal results in both content preservation and artistic stylization. In this paper, we present multi-level dynamic style transfer for NeRFs (MDS-NeRF), a novel approach that reengineers the NeRF pipeline specifically for stylization and incorporates an innovative dynamic style injection module. Particularly, we propose a multi-level feature adaptor that helps generate a multi-level feature grid representation from the content radiance field, effectively capturing the multi-scale spatial structure of the scene. In addition, we present a dynamic style injection module that learns to extract relevant style features and adaptively integrates them into the content patterns. The stylized multi-level features are then transformed into the final stylized view through our proposed multi-level cascade decoder. Furthermore, we extend our 3D style transfer method to support omni-view style transfer using 3D style references. Extensive experiments demonstrate that MDS-NeRF achieves outstanding performance for 3D style transfer, preserving multi-scale spatial structures while effectively transferring stylistic characteristics.
