Sparse-DeRF: Deblurred Neural Radiance Fields from Sparse View

Dogyoon Lee, Donghyeong Kim, Jungho Lee, Minhyeok Lee, Seunghoon Lee, Sangyoun Lee

TL;DR

Sparse-DeRF regularizes the complicated joint optimization of blur kernels and NeRF from sparse views, reducing overfitting artifacts and improving the quality of the radiance fields.

Abstract

Recent studies construct deblurred neural radiance fields (DeRF) from dozens of blurry images, which is impractical when only a limited number of blurry images are available. This paper focuses on constructing DeRF from sparse views, a more pragmatic real-world scenario. As observed in our experiments, establishing DeRF from sparse views is a more challenging problem because blur kernels and NeRF must be optimized simultaneously from only a few views. Sparse-DeRF regularizes this complicated joint optimization, reducing overfitting artifacts and improving the quality of the radiance fields. The regularization consists of three key components: surface smoothness, which helps the model accurately predict the scene structure by utilizing unseen rays and additional hidden rays derived from the blur kernel, based on statistical tendencies of the real world; modulated gradient scaling, which helps the model adjust the amount of backpropagated gradient according to the arrangement of scene objects; and perceptual distillation, which improves perceptual quality by overcoming the ill-posed multi-view inconsistency of image deblurring and distilling pre-deblurred information, compensating for the lack of clean information in blurry images. We demonstrate the effectiveness of Sparse-DeRF with extensive quantitative and qualitative experiments, training DeRF from 2-view, 4-view, and 6-view blurry images.

Paper Structure (41 sections, 25 equations, 16 figures, 28 tables)

Figures (16)

  • Figure 1: Simple illustration of the different blur kernel modeling in (a) Deblur-NeRF ma2022deblurnerf and (b) DP-NeRF lee2023dpnerf. The main difference between the two kernels is the consistency between the transformed rays induced by the blur kernel.
  • Figure 2: Overall architecture of Sparse-DeRF. Sparse-DeRF consists of three main components, Surface Smoothness (SS), Modulated Gradient Scaling (MGS), and Perceptual Distillation (PD), denoted as (a), (b), and (c) in the figure. Note that the NeRF network is shared to predict the color of each sampled point along the hidden rays, the integrated unobserved rays, and the rays for patch rendering.
  • Figure 3: Simple illustration of the unobserved rays in our method, which consist of unseen rays and hidden rays. The two types of rays are defined independently. Unseen rays remain unchanged during training, whereas hidden rays change during training because they are derived from the learned blur kernel. Note that hidden rays can be derived from both types of blur kernel we utilize.
  • Figure 4: Comparison between our modulated gradient scaling function $\hat{J}(\delta_{\textbf{s}_{i}})$ and the previous function $J(\delta_{\textbf{s}_{i}})$ proposed by philip2023floatersnomore, with respect to the ray distance $\delta_{\textbf{s}_{i}}$. In the graphs, the x-axis shows $\delta_{\textbf{s}_{i}}$, the ray distance from the camera origin, and the y-axis shows $\min(1, J(\delta_{\textbf{s}_{i}}))$, the gradient scaling value. For clarity, the $J(\delta_{\textbf{s}_{i}})$ of philip2023floatersnomore is drawn as $x^2$ with a black line. The graphs illustrate that our modulated function $\hat{J}(\delta_{\textbf{s}_{i}})$ can cover the diversity in the arrangement of scene components, taking various shapes depending on the magnitude $\rho$ and period $\eta$.
  • Figure 5: Illustration of our perceptual distillation. Perceptual distillation transfers pre-deblurred texture information by applying the perceptual loss between the pre-deblurred color patch $\bar{C}_{ptc}$ and the rendered color patch $\hat{C}_{ptc}$, which is rendered from patch-wise sampled rays $\textbf{r}^{pd}_{ptc}$ at the same pixel locations. Note that $\Theta_{D}$ and $\mathcal{E}$ are a pre-trained image deblurring network and a shared pre-trained image feature extractor, respectively.
  • ...and 11 more figures
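
The Figure 4 caption describes the baseline gradient scaling of philip2023floatersnomore as $\min(1, J(\delta_{\textbf{s}_{i}}))$ with $J$ drawn as $x^2$. A minimal sketch of that baseline alone, in plain Python, might look as follows. The function names and toy values are hypothetical and not taken from the paper, and the modulated function $\hat{J}$ is not sketched because its dependence on $\rho$ and $\eta$ is not specified in this excerpt.

```python
def baseline_gradient_scale(ray_distance: float) -> float:
    """Per-sample scaling factor min(1, d^2) for a sample at distance d
    from the camera origin (the baseline curve described for Figure 4)."""
    return min(1.0, ray_distance ** 2)


def scale_sample_gradients(gradients, ray_distances):
    """Scale each sample's gradient vector by the factor for its distance.

    gradients:     list of per-sample gradient vectors (lists of floats)
    ray_distances: list of per-sample distances along the ray
    """
    if len(gradients) != len(ray_distances):
        raise ValueError("need one ray distance per gradient vector")
    return [
        [g * baseline_gradient_scale(d) for g in grad]
        for grad, d in zip(gradients, ray_distances)
    ]


# Hypothetical toy values: four samples along one ray, unit gradients.
distances = [0.1, 0.5, 1.0, 2.0]
gradients = [[1.0, 1.0, 1.0] for _ in distances]
scaled = scale_sample_gradients(gradients, distances)
```

Under these assumptions, gradients of samples close to the camera are strongly damped, while samples at distance 1 or farther keep their gradients unchanged. In a real training loop this scaling would be applied during backpropagation rather than to a standalone list of gradients.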