Sharp-NeRF: Grid-based Fast Deblurring Neural Radiance Fields Using Sharpness Prior

Byeonghyeon Lee, Howoong Lee, Usman Ali, Eunbyung Park

TL;DR

Sharp-NeRF targets the fidelity gap in NeRF under imperfect capture by introducing grid-based learnable blur kernels guided by a per-pixel sharpness prior. Built on a Tensorial Radiance Fields backbone, it replaces costly MLP-based kernel generation with discrete kernels that are indexed by a precomputed sharpness map and optimized via a photo-consistent reconstruction loss on blurred patches. The method uses random patch sampling to model inter-pixel blur efficiently, achieving sharp, richly detailed renders with training times under 30 minutes and competitive or superior no-reference perceptual quality (quantified by NIQE and BRISQUE) on real and synthetic defocus datasets. Extensive ablations identify the sharpness prior, patch-based rendering, and grid kernels as key contributors to both quality and speed. Limitations remain for motion blur and other degradations, which point to future extensions. Overall, Sharp-NeRF demonstrates that fully grid-based deblurring neural fields can rival prior approaches with substantial gains in training efficiency and perceptual quality.

Abstract

Neural Radiance Fields (NeRF) have shown remarkable performance in neural rendering-based novel view synthesis. However, NeRF suffers from severe visual quality degradation when the input images have been captured under imperfect conditions, such as poor illumination, defocus blur, and lens aberrations. Defocus blur in particular is common in images captured with ordinary cameras. Although a few recent studies have proposed methods that render sharp, high-quality images, they still face key challenges. In particular, those methods employ a Multi-Layer Perceptron (MLP) based NeRF, which requires tremendous computational time. To overcome these shortcomings, this paper proposes Sharp-NeRF, a grid-based NeRF that renders clean and sharp images from blurry input images within half an hour of training. To do so, we use several grid-based kernels to accurately model the sharpness/blurriness of the scene. The sharpness level of each pixel is computed to learn the spatially varying blur kernels. We conduct experiments on benchmarks consisting of blurry images and evaluate with full-reference and no-reference metrics. The qualitative and quantitative results show that our approach renders sharp novel views with vivid colors and fine details, and that it trains considerably faster than previous works. Our project page is available at https://benhenryl.github.io/SharpNeRF/

Paper Structure

This paper contains 26 sections, 7 equations, 8 figures, 10 tables.

Figures (8)

  • Figure 1: Comparison in terms of training time and image quality on the real defocus dataset. Left: Evaluated under a full-reference metric (PSNR). Right: Evaluated under a no-reference metric (NIQE).
  • Figure 2: The overall architecture of Sharp-NeRF. $\boxtimes$ stands for weighted sum. First, the defocus map of each training view is computed using a sharpness measure operator. The defocus map is then quantized into $N_k$ values, which serve as the per-pixel sharpness level map $L$. This is a preprocessing step, so $L$ does not need to be computed during training. During training, ray patches are fed to the backbone neural field model TensoRF (Chen et al., 2022), which renders a sharp and clean image $I_c$. $I_c$ is then cropped with stride 1 into small patches $I'_c$, each of size $K \times K$. The preprocessed $L$ of the input ray patch is also given as input and used to index the blur kernels $\mathcal{B}$ to obtain the per-pixel weights $w_{x}$. Subsequently, the weighted sum of $I'_c$ with $w_{x}$ renders the blurred image $I_b$. Note that at inference time only the modules in the gray box are used: the blur kernels are no longer required, and $I_c$ is the final rendered output.
  • Figure 3: Left: random ray sampling. Right: random patch sampling. Blue pixels are the $P' \times P'$ pixels of interest to be rendered, and sky-blue pixels are the neighboring pixels required for blur convolution.
  • Figure 4: Visualization of blur kernels. From left to right, the kernel values become more widely spread, which implies that the leftmost kernel is responsible for sharp regions and the rightmost kernel for blurry regions.
  • Figure 5: Qualitative results on the real defocus dataset. Our proposed method renders sharp images with vivid colors and fine details.
  • ...and 3 more figures
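
The blur composition described in the Figure 2 and Figure 3 captions can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the min-max quantization rule, and the use of fixed arrays in place of learnable kernels are all assumptions made for illustration. The sketch assumes a rendered sharp patch of size $P \times P$ with $P = P' + K - 1$, so that every one of the $P' \times P'$ pixels of interest has a full $K \times K$ neighborhood.

```python
import numpy as np

def quantize_sharpness(defocus_map, num_kernels):
    """Quantize a per-pixel map into integer levels 0..num_kernels-1.

    Assumption: simple min-max normalization followed by uniform binning;
    the source only states that the map is quantized into N_k values.
    """
    lo, hi = defocus_map.min(), defocus_map.max()
    scale = max(hi - lo, 1e-12)  # avoid division by zero on flat maps
    normalized = (defocus_map - lo) / scale
    levels = np.floor(normalized * num_kernels).astype(np.int64)
    return np.clip(levels, 0, num_kernels - 1)

def blur_patch(sharp_patch, level_patch, kernels):
    """Compose a blurred patch I_b from a sharp patch I_c.

    sharp_patch: (P, P, 3) rendered sharp colors, P = P' + K - 1.
    level_patch: (P', P') integer sharpness levels of the pixels of interest.
    kernels:     (N_k, K, K) blur kernels (learnable in the paper; fixed
                 arrays in this sketch), each assumed to sum to 1.
    Returns:     (P', P', 3) blurred colors.
    """
    k = kernels.shape[1]
    # Stride-1 crops of size K x K: shape (P', P', 3, K, K).
    windows = np.lib.stride_tricks.sliding_window_view(
        sharp_patch, (k, k), axis=(0, 1)
    )
    # Index one kernel per pixel by its sharpness level: (P', P', K, K).
    weights = kernels[level_patch]
    # Per-pixel weighted sum over each K x K neighborhood.
    return np.einsum("ijckl,ijkl->ijc", windows, weights)
```

With a kernel whose mass sits entirely at its center, `blur_patch` returns the interior of the sharp patch unchanged, which matches the caption's reading of Figure 4 that a concentrated kernel corresponds to sharp regions. At inference this step would be skipped and the sharp patch used directly.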