CoARF: Controllable 3D Artistic Style Transfer for Radiance Fields
Deheng Zhang, Clara Fernandez-Labrador, Christopher Schroers
TL;DR
CoARF addresses the lack of fine-grained controllability in radiance-field style transfer by introducing a multi-view 2D mask-based optimization framework that supports object selection, compositional transfers, and semantic-aware stylization. It augments the baseline ARF approach with label-dependent losses and a semantic-aware nearest-neighbor feature matching (SANNFM) scheme that blends VGG texture cues with LSeg semantics. The method demonstrates superior stylization quality and controllability across forward-facing and 360° scenes, outperforming ARF and StyleRF in both qualitative and user-study evaluations. The approach is computationally efficient on modern GPUs and can generalize to other differentiable radiance-field representations, enabling practical 3D artistic editing in film and games. The key contributions are the multi-view 2D mask-based optimization, the object/compositional/semantic control modules, and the SANNFM mechanism that improves semantic alignment during style transfer.
Abstract
Creating artistic 3D scenes can be time-consuming and requires specialized knowledge. To address this, recent works such as ARF use a radiance field-based approach with style constraints to generate 3D scenes that resemble a style image provided by the user. However, these methods lack fine-grained control over the resulting scenes. In this paper, we introduce Controllable Artistic Radiance Fields (CoARF), a novel algorithm for controllable 3D scene stylization. CoARF enables style transfer for specified objects, compositional 3D style transfer, and semantic-aware style transfer. We achieve controllability using segmentation masks with different label-dependent loss functions. We also propose a semantic-aware nearest neighbor matching algorithm to improve the style transfer quality. Our extensive experiments demonstrate that CoARF provides user-specified controllability of style transfer and superior style transfer quality with more precise feature matching.
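To make the idea of semantic-aware nearest neighbor matching concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that each rendered and style location carries a texture feature (for example from VGG) and a semantic feature (for example from LSeg), and that the two cosine distances are blended linearly before the nearest neighbor is picked. The function names, the blend weight `alpha`, and the choice to measure the loss on the texture distance of the matched pairs are all hypothetical and chosen only for illustration.

```python
import numpy as np


def cosine_distance(a, b, eps=1e-8):
    """Pairwise cosine distance between rows of a (N, C) and b (M, C)."""
    a_n = a / (np.linalg.norm(a, axis=1, keepdims=True) + eps)
    b_n = b / (np.linalg.norm(b, axis=1, keepdims=True) + eps)
    return 1.0 - a_n @ b_n.T


def semantic_aware_nn_match(render_tex, style_tex, render_sem, style_sem, alpha=0.5):
    """Match each rendered feature to a style feature.

    alpha is a hypothetical blend weight: alpha=1.0 matches on texture
    features only, and smaller values give semantics more influence.
    """
    d_tex = cosine_distance(render_tex, style_tex)   # (N, M) texture distance
    d_sem = cosine_distance(render_sem, style_sem)   # (N, M) semantic distance
    blended = alpha * d_tex + (1.0 - alpha) * d_sem  # assumed linear blend
    idx = blended.argmin(axis=1)                     # nearest style feature per row
    # Assumption: the style loss is the mean texture distance of matched pairs.
    loss = d_tex[np.arange(len(idx)), idx].mean()
    return idx, loss


# Toy example: texture features are identical, so only semantics can disambiguate.
render_tex = np.array([[1.0, 0.0], [1.0, 0.0]])
style_tex = np.array([[1.0, 0.0], [1.0, 0.0]])
render_sem = np.array([[1.0, 0.0], [0.0, 1.0]])
style_sem = np.array([[0.0, 1.0], [1.0, 0.0]])

idx_blend, loss_blend = semantic_aware_nn_match(
    render_tex, style_tex, render_sem, style_sem, alpha=0.5
)
idx_tex_only, _ = semantic_aware_nn_match(
    render_tex, style_tex, render_sem, style_sem, alpha=1.0
)
```

In this toy case, texture-only matching cannot tell the two style features apart, while the blended distance assigns each rendered feature to the style feature with the same semantic vector. In an actual optimization loop the same computation would be written in a differentiable framework so that gradients flow back to the radiance field.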
