Drantal-NeRF: Diffusion-Based Restoration for Anti-aliasing Neural Radiance Field

Ganlin Yang, Kaidong Zhang, Jingjing Fu, Dong Liu

TL;DR

Drantal-NeRF addresses aliasing in Neural Radiance Field (NeRF) renderings with a diffusion-based restoration pipeline. Training proceeds in two stages: the first trains the NeRF while finetuning a pretrained diffusion model conditioned on the aliased NeRF outputs; the second trains a controllable feature wrapping (CFW) module and finetunes the VAE decoder adversarially to produce high-fidelity, multi-view-consistent restorations. The framework is agnostic to the NeRF backbone. It yields substantial improvements on large-scale urban (MatrixCity) and unbounded 360-degree (MipNeRF-360) datasets, shown by gains in PSNR, SSIM, and LPIPS and by crisper, more texture-rich visuals. Overall, Drantal demonstrates that diffusion priors can serve as a robust, general post-processing tool for anti-aliasing NeRF backbones, potentially guiding future restoration-oriented 3D pipelines.
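
As a point of reference for the PSNR figures mentioned above, the standard definition is $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$. The following is a minimal sketch of that formula in plain Python, not the authors' evaluation code; the function name `psnr`, the flat pixel lists, and the sample values are all hypothetical and chosen only for illustration.

```python
import math


def psnr(rendered, reference, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Pixel values are assumed to lie in [0, max_value].
    """
    if len(rendered) != len(reference) or not rendered:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical pixel values, for illustration only (not data from the paper).
ground_truth = [0.20, 0.50, 0.80, 0.40]
aliased = [0.10, 0.50, 0.90, 0.30]
restored = [0.18, 0.52, 0.82, 0.38]

print(psnr(aliased, ground_truth))
print(psnr(restored, ground_truth))
```

A higher PSNR means the rendering is closer to the ground truth in the mean-squared sense. SSIM and LPIPS measure structural and perceptual similarity respectively and are not covered by this sketch.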

Abstract

Aliasing artifacts in renderings produced by Neural Radiance Fields (NeRF) are a long-standing and complex issue in 3D implicit representation. They arise from a multitude of intricate causes and have previously been mitigated by designing more advanced, but also more complex, scene parameterization methods. In this paper, we present a Diffusion-based restoration method for anti-aliasing Neural Radiance Field (Drantal-NeRF). We approach anti-aliasing from a low-level restoration perspective, viewing aliasing artifacts as a kind of degradation applied to clean ground truths. By leveraging the powerful prior knowledge encapsulated in a diffusion model, we restore highly realistic anti-aliased renderings conditioned on their aliased, low-quality counterparts. We further employ a feature-wrapping operation to ensure multi-view restoration consistency and finetune the VAE decoder to better adapt to the scene-specific data distribution. Our proposed method is easy to implement and agnostic to the NeRF backbone. We conduct extensive experiments on challenging large-scale urban scenes as well as unbounded 360-degree scenes and achieve substantial qualitative and quantitative improvements.

Paper Structure

This paper contains 23 sections, 10 equations, 12 figures, 5 tables, and 1 algorithm.

Figures (12)

  • Figure 1: Overview of our proposed method, Drantal, which involves two-stage training. In the first stage, we optimize a Neural Radiance Field along with the time-aware encoder $\mathcal{E}_\phi$ and the SFT layers in the diffusion model, marked in green. In the second stage, we optimize the controllable feature wrapping (CFW) module as well as the VAE decoder $\mathcal{D}$, marked in blue. The VAE encoder $\mathcal{E}$ and the remaining parameters of the diffusion model are kept fixed, marked in orange. After the whole training process, we can restore high-quality anti-aliased renderings from the degraded ones output by the Neural Radiance Field.
  • Figure 2: Visualizations of aerial scenes in the MatrixCity dataset. Our proposed Drantal produces renderings with markedly more realistic texture detail and far fewer aliasing artifacts, blur, and floaters. We crop a patch from each image and zoom in for a more detailed comparison.
  • Figure 3: Visualizations of the scenes stump, counter, and treehill (from top to bottom) in the MipNeRF-360 dataset. Drantal exhibits a strong capability to restore more realistic anti-aliased renderings, with sharper edges and finer details. We crop a patch and zoom in for a more detailed comparison.
  • Figure 4: Line chart of performance gains as voxel resolution changes. Solid line: PSNR. Dashed line: LPIPS.
  • Figure 5: Detailed structure of the controllable feature wrapping (CFW) module, depicted on the right side. We omit the inner structure of the Stable Diffusion model for brevity.
  • ...and 7 more figures
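
The two-stage schedule described in the Figure 1 caption can be summarized as a table of which components receive gradient updates in each stage. The sketch below is a plain-Python illustration, not the authors' code: the component names are hypothetical labels, and it assumes that the components trained in stage one are held fixed in stage two, which the caption implies but does not state outright.

```python
# Component names are illustrative labels, not identifiers from the authors' code.
STAGE_TRAINABLE = {
    1: {"nerf", "time_aware_encoder", "sft_layers"},  # green in Figure 1
    2: {"cfw_module", "vae_decoder"},                 # blue in Figure 1
}

# Kept fixed in both stages (orange in Figure 1).
ALWAYS_FROZEN = {"vae_encoder", "diffusion_backbone"}

ALL_COMPONENTS = set().union(*STAGE_TRAINABLE.values()) | ALWAYS_FROZEN


def trainable_flags(stage):
    """Return {component: bool} saying whether each component is optimized.

    Assumption: anything not listed for the given stage is frozen, including
    the stage-one components during stage two.
    """
    if stage not in STAGE_TRAINABLE:
        raise ValueError(f"unknown stage: {stage!r}")
    active = STAGE_TRAINABLE[stage]
    return {name: (name in active) for name in sorted(ALL_COMPONENTS)}


for stage in (1, 2):
    flags = trainable_flags(stage)
    optimized = [name for name, on in flags.items() if on]
    print(f"stage {stage}: optimize {optimized}")
```

In a PyTorch implementation, flags like these could be applied by calling `requires_grad_(flag)` on each corresponding module before building the optimizer for that stage.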