Dual-Camera All-in-Focus Neural Radiance Fields

Xianrui Luo, Zijin Wu, Juewen Peng, Huiqiang Sun, Zhiguo Cao, Guosheng Lin

TL;DR

DC-NeRF synthesizes all-in-focus (AiF) neural radiance fields from smartphone imagery in which the main-camera views share a view-consistent defocus blur. It introduces a dual-camera pipeline that aligns the ultra-wide camera's deep-DoF image to the high-resolution, shallower-DoF main camera, then applies a defocus-aware fusion, guided by a learned defocus map and a blending mask, to produce AiF novel views. On a multi-view smartphone dataset, the method compares favorably against strong baselines in PSNR, SSIM, and LPIPS, and it supports DoF applications such as refocusing and split diopter.

Abstract

We present the first framework capable of synthesizing an all-in-focus neural radiance field (NeRF) from inputs captured without manual refocusing. Without refocusing, the camera automatically focuses on the same fixed object in every view, and current NeRF methods, which typically use a single camera, fail because of the consistent defocus blur and the lack of a sharp reference. To restore the all-in-focus NeRF, we use the dual-camera setup found in smartphones, where the ultra-wide camera has a wider depth-of-field (DoF) and the main camera has a higher resolution. The dual-camera pair preserves the high-fidelity details from the main camera and uses the ultra-wide camera's deep DoF as a reference for all-in-focus restoration. To this end, we first apply spatial warping and color matching to align the two cameras, followed by a defocus-aware fusion module with learnable defocus parameters that predicts a defocus map and fuses the aligned camera pair. We also build a multi-view dataset that includes image pairs from the main and ultra-wide cameras of a smartphone. Extensive experiments on this dataset verify that our solution, termed DC-NeRF, produces high-quality all-in-focus novel views and compares favorably against strong baselines both quantitatively and qualitatively. We further show DoF applications of DC-NeRF with adjustable blur intensity and focal plane, including refocusing and split diopter.
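
The fusion idea in the abstract can be illustrated with a small image-space sketch. This is not the paper's module: DC-NeRF predicts its blending mask with a network and fuses inside volume rendering, whereas the code below assumes a defocus map is already available and derives a soft mask from it with a hand-picked sigmoid. The function names `soft_mask_from_defocus` and `fuse_pair` and the parameters `threshold` and `softness` are hypothetical, chosen only for illustration.

```python
import numpy as np


def soft_mask_from_defocus(defocus, threshold=1.0, softness=0.5):
    """Return a mask in (0, 1) that is near 1 where the blur is small.

    `defocus` is an (H, W) array of per-pixel blur magnitudes.
    `threshold` and `softness` are hypothetical tuning knobs, not
    values taken from the paper.
    """
    blur = np.abs(np.asarray(defocus, dtype=np.float64))
    # Logistic falloff: in-focus pixels (blur << threshold) map close to 1.
    return 1.0 / (1.0 + np.exp((blur - threshold) / softness))


def fuse_pair(main, wide_aligned, mask):
    """Per-pixel convex combination of two aligned (H, W, C) images.

    Where mask is near 1 the main-camera pixel is kept; where it is near 0
    the aligned ultra-wide pixel is used instead.
    """
    weight = np.asarray(mask, dtype=np.float64)[..., None]  # broadcast over channels
    return weight * main + (1.0 - weight) * wide_aligned
```

In this simplification, regions where the main camera is in focus keep its higher-resolution detail, and defocused regions fall back to the deep-DoF ultra-wide image, which mirrors the division of labor the abstract describes.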

Paper Structure

This paper contains 20 sections, 17 equations, 18 figures, 8 tables.

Figures (18)

  • Figure 1: We are the first to synthesize all-in-focus novel views from smartphone captures with consistent defocus blur. The sampled main/ultra-wide camera pair is shown on the upper left: the first row is the main camera and the second row is the ultra-wide camera. The differences between the two cameras are highlighted on the lower left. The two cameras differ visibly in field of view (indicated by the white box) and resolution. They also differ in depth-of-field (the purple box) and show the spatial and color misalignment seen in the cyan box (spatial relationship between the doll and the wooden case in the orange box). Compared with existing methods, our DC-NeRF recovers a sharp radiance field from a set of main-camera images focused on the same target, with the help of a secondary ultra-wide camera.
  • Figure 2: Our framework includes two modules: image alignment and defocus-aware fusion. Homography and flow warping are used to align the main/ultra-wide image pair $I_m$ and $I_w$. We then adjust the color of the warped ultra-wide image via histogram matching. To fuse the high-quality shot of the main camera view with the deep DoF information of the ultra-wide view, we propose a defocus-aware fusion network that estimates defocus parameters from bokeh rendering. The network then predicts a blending mask that fuses the dual-camera views in volume rendering to generate the AiF novel views $I_{AiF}$.
  • Figure 3: An example view of the dataset. For each view, we capture a foreground-focused main image $I_{m}^{fg}$ and a background-focused main image $I_{m}^{bg}$. We simultaneously capture the ultra-wide image $I_w$ with deep DoF. The main camera focal stack is used to synthesize the AiF ground truth $I_{gt}$.
  • Figure 4: The advantage of the main camera over the ultra-wide camera. Although the ultra-wide camera has a larger DoF than the main camera, we still prefer the main camera for the focused area, because it captures finer details owing to its sensor resolution.
  • Figure 5: The alignment between an ultra-wide image view $I_w$ and a main image view $I_m$. We use image registration to warp the ultra-wide view $I_w$ at the image level, and optical flow is applied to compensate for parallax between different depth planes. Histogram matching is then used to match the color of the spatially warped ultra-wide view to that of the main view.
  • ...and 13 more figures
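
The color-matching step named in the Figure 2 and Figure 5 captions can be sketched with classic CDF-based histogram matching. The captions do not specify the exact variant or color space the authors use, so the code below is a generic per-channel version under that assumption; `match_histogram` and `match_colors` are hypothetical helper names.

```python
import numpy as np


def match_histogram(source, reference):
    """Remap one channel of `source` so its intensity CDF follows `reference`."""
    src_values, src_inverse, src_counts = np.unique(
        source.ravel(), return_inverse=True, return_counts=True
    )
    ref_values, ref_counts = np.unique(reference.ravel(), return_counts=True)

    # Empirical cumulative distributions of both channels.
    src_cdf = np.cumsum(src_counts) / source.size
    ref_cdf = np.cumsum(ref_counts) / reference.size

    # For each source quantile, look up the reference intensity at that quantile.
    mapped_values = np.interp(src_cdf, ref_cdf, ref_values)
    return mapped_values[src_inverse].reshape(source.shape)


def match_colors(source_rgb, reference_rgb):
    """Apply histogram matching independently to each channel of (H, W, C) images."""
    channels = [
        match_histogram(source_rgb[..., c], reference_rgb[..., c])
        for c in range(source_rgb.shape[-1])
    ]
    return np.stack(channels, axis=-1)
```

In the setting of Figure 5, `source_rgb` would be the spatially warped ultra-wide view and `reference_rgb` the main view, so the ultra-wide colors are pulled toward the main camera's color distribution. The mapping is monotonic per channel, which preserves the ordering of intensities in the warped image.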