CoARF: Controllable 3D Artistic Style Transfer for Radiance Fields

Deheng Zhang, Clara Fernandez-Labrador, Christopher Schroers

TL;DR

CoARF addresses the lack of fine-grained controllability in radiance-field style transfer by introducing a multi-view 2D mask-based optimization framework that supports object selection, compositional style transfer, and semantic-aware stylization. It augments the baseline ARF approach with label-dependent losses and a semantic-aware nearest-neighbor feature matching (SANNFM) step that blends VGG texture cues with LSeg semantics. The method demonstrates superior stylization quality and controllability across forward-facing and 360° scenes, outperforming ARF and StyleRF in both qualitative and user-study evaluations. The approach is computationally efficient on modern GPUs and can generalize to other differentiable radiance-field representations, enabling practical 3D artistic editing in film and games. The key contributions are the multi-view 2D mask-based optimization, the object, compositional, and semantic control modules, and the SANNFM mechanism that improves semantic alignment during style transfer.

Abstract

Creating artistic 3D scenes can be time-consuming and requires specialized knowledge. To address this, recent works such as ARF use a radiance field-based approach with style constraints to generate 3D scenes that resemble a style image provided by the user. However, these methods lack fine-grained control over the resulting scenes. In this paper, we introduce Controllable Artistic Radiance Fields (CoARF), a novel algorithm for controllable 3D scene stylization. CoARF enables style transfer for specified objects, compositional 3D style transfer, and semantic-aware style transfer. We achieve controllability using segmentation masks with different label-dependent loss functions. We also propose a semantic-aware nearest neighbor matching algorithm to improve the style transfer quality. Our extensive experiments demonstrate that CoARF provides user-specified controllability of style transfer and superior style transfer quality with more precise feature matching.

Paper Structure (19 sections, 16 equations, 12 figures)


Figures (12)

  • Figure 1: CoARF Overview. Given a set of multi-view ground truth images and style images, our controllable style transfer model allows the user to perform object selection (a), compositional style transfer (b), and semantic-aware style transfer (c) by using 2D mask-based optimization with different spatially dependent loss definitions.
  • Figure 2: Pipeline of Semantic-aware Style Transfer. The multi-view images and the style image are used to extract VGG features $\mathbf{F}_{r}^{VGG}, \mathbf{F}_{s}^{VGG}$ and LSeg features $\mathbf{F}_{r}^{LSeg}, \mathbf{F}_{s}^{LSeg}$. The cosine distances in the two feature spaces are then computed and blended using the hyperparameter $\alpha$. In the SANNFM module shown in the figure, ellipses represent different semantic labels, the shapes (square, triangle, circle) represent the semantic information of the pixels, and the color of each shape represents the color and textural information of the pixels. For each pixel in the rendered image, the mixed distance is used to find its nearest neighbor among the style-image pixels with the same label. Finally, the optimization uses the VGG cosine distance only.
  • Figure 3: Multi-view correction. (a) In a single-view optimization, point A should be optimized using the background loss but is instead optimized using the foreground object loss. (b) Multi-view rays passing through point A. The first two views have the correct label for point A (green rays); the last two views have an incorrect label (red rays). The final gradient is dominated by the correct loss gradients from the first two views.
  • Figure 4: Object selection result. Our algorithm can generate the stylized object for the selected region and keep other objects photorealistic.
  • Figure 5: Qualitative comparison of compositional style transfer. Our result (a) matches the style image better in terms of brushstrokes and color than StyleRF (Liu et al., 2023) (b).
  • ...and 7 more figures
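
The matching step described in the Figure 2 caption can be illustrated with a small NumPy sketch. This is not the authors' implementation; it only mirrors the caption: compute cosine distances in VGG and LSeg feature space, blend them with $\alpha$, restrict candidates to style pixels with the same label, and evaluate the loss with the VGG distance alone. The function names, the flattened `(pixels, channels)` feature layout, the blending convention `alpha * d_vgg + (1 - alpha) * d_lseg`, and the handling of pixels without a same-label candidate are all assumptions made for illustration.

```python
import numpy as np


def cosine_distance(a, b, eps=1e-8):
    """Pairwise cosine distance between rows of a (N, C) and rows of b (M, C)."""
    a_n = a / (np.linalg.norm(a, axis=1, keepdims=True) + eps)
    b_n = b / (np.linalg.norm(b, axis=1, keepdims=True) + eps)
    return 1.0 - a_n @ b_n.T


def sannfm_match(f_r_vgg, f_s_vgg, f_r_lseg, f_s_lseg,
                 labels_r, labels_s, alpha=0.5):
    """Hypothetical sketch of semantic-aware nearest-neighbor matching.

    Returns the index of the matched style pixel for every rendered pixel
    and the mean VGG cosine distance over the matched pairs.
    """
    d_vgg = cosine_distance(f_r_vgg, f_s_vgg)     # texture and color cue
    d_lseg = cosine_distance(f_r_lseg, f_s_lseg)  # semantic cue

    # Assumed convention: alpha weights VGG, (1 - alpha) weights LSeg.
    d_mix = alpha * d_vgg + (1.0 - alpha) * d_lseg

    # Only style pixels carrying the same label are valid candidates.
    same_label = labels_r[:, None] == labels_s[None, :]
    d_mix = np.where(same_label, d_mix, np.inf)
    nn_idx = d_mix.argmin(axis=1)

    # Assumption: rendered pixels with no same-label style pixel are
    # left out of the loss instead of being matched arbitrarily.
    has_match = same_label.any(axis=1)
    rows = np.arange(len(nn_idx))[has_match]

    # The blended distance selects the neighbor; the loss is VGG-only.
    loss = d_vgg[rows, nn_idx[has_match]].mean()
    return nn_idx, loss


# Toy data: 2 rendered pixels, 3 style pixels, 2-channel features.
f_r_vgg = np.array([[1.0, 0.0], [0.0, 1.0]])
f_s_vgg = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
f_r_lseg = np.array([[1.0, 0.0], [0.0, 1.0]])
f_s_lseg = np.array([[0.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
labels_r = np.array([0, 1])
labels_s = np.array([1, 1, 0])

nn_idx, loss = sannfm_match(f_r_vgg, f_s_vgg, f_r_lseg, f_s_lseg,
                            labels_r, labels_s, alpha=0.5)
```

In the toy data, the first rendered pixel can only be matched to the single style pixel that shares its label, even though another style pixel has identical VGG features. That is the behavior the label restriction is meant to enforce.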