Adjust Your Focus: Defocus Deblurring From Dual-Pixel Images Using Explicit Multi-Scale Cross-Correlation

Kunal Swami

TL;DR

This work tackles defocus blur in photography by leveraging dual-pixel disparity cues through an explicit cross-correlation mechanism. It introduces MCCNet, a Siamese-encoder network with cross-correlation blocks and multi-scale fusion that guides region-dependent deblurring across the image. The model is trained with a Charbonnier plus MS-SSIM loss on the DPDD dataset and achieves state-of-the-art PSNR/SSIM with substantially fewer parameters and competitive FLOPs; ablations support the design. The approach improves detail recovery in defocused regions and offers a computationally efficient solution for all-in-focus reconstruction from dual-pixel data.
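As a rough illustration of the Charbonnier term mentioned above, the following minimal sketch computes the penalty sqrt(diff^2 + eps^2) averaged over flattened pixel values. The function name `charbonnier_loss`, the default `eps`, and the use of plain Python lists are assumptions made for illustration; the MS-SSIM term and the weighting between the two terms are not specified in this summary and are left out.

```python
import math


def charbonnier_loss(pred, target, eps=1e-3):
    """Mean Charbonnier penalty between two equal-length sequences of floats.

    The penalty sqrt(diff^2 + eps^2) behaves like |diff| for large errors
    and stays smooth near zero. `eps` here is an assumed value, not one
    taken from the paper.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if not pred:
        raise ValueError("inputs must be non-empty")
    total = 0.0
    for p, t in zip(pred, target):
        diff = p - t
        total += math.sqrt(diff * diff + eps * eps)
    return total / len(pred)
```

For example, `charbonnier_loss([3.0], [0.0], eps=4.0)` evaluates sqrt(9 + 16), and identical inputs give a loss equal to `eps` rather than zero.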

Abstract

Defocus blur is a common problem in photography. It arises when an image is captured with a wide aperture, resulting in a shallow depth of field. Sometimes it is desired, e.g., for a portrait effect. Otherwise, it is a problem both aesthetically and for downstream computer vision tasks, such as segmentation and depth estimation. Deblurring an out-of-focus image to obtain an all-in-focus image is a highly challenging and often ill-posed problem. A recent work exploited dual-pixel (DP) image information, widely available in consumer DSLRs and high-end smartphones, to solve the problem of defocus deblurring. DP sensors produce two sub-aperture views containing defocus disparity cues. A given pixel's disparity is directly proportional to its distance from the focal plane. However, the existing methods adopt a naïve approach of channel-wise concatenation of the two DP views without explicitly utilizing the disparity cues within the network. In this work, we propose to perform an explicit cross-correlation between the two DP views to guide the network toward appropriate deblurring in different image regions. We adopt multi-scale cross-correlation to handle blur and disparities at different scales. Quantitative and qualitative evaluation of our multi-scale cross-correlation network (MCCNet) reveals that it achieves better defocus deblurring than existing state-of-the-art methods despite having lower computational complexity.
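To make the idea of explicit cross-correlation between the two DP views concrete, here is a minimal sketch on a single scanline. It is not the MCCNet module: the names `correlation_volume` and `best_shift`, the horizontal-only search window `max_disp`, the channel-averaged dot product, and the zero score for out-of-range positions are all assumptions chosen for illustration.

```python
def correlation_volume(left, right, max_disp):
    """Correlate each left-view feature with right-view features at nearby shifts.

    left, right: lists of per-pixel feature vectors (lists of floats) for one
    scanline. Returns one row per pixel x holding 2 * max_disp + 1 scores,
    one for each shift d in [-max_disp, max_disp]. Shifts that fall outside
    the scanline get a score of 0.0 (an assumed padding choice).
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same width")
    width = len(left)
    volume = []
    for x in range(width):
        scores = []
        for d in range(-max_disp, max_disp + 1):
            xr = x + d
            if 0 <= xr < width:
                # Channel-averaged dot product of the two feature vectors.
                dot = sum(a * b for a, b in zip(left[x], right[xr]))
                scores.append(dot / len(left[x]))
            else:
                scores.append(0.0)
        volume.append(scores)
    return volume


def best_shift(scores, max_disp):
    """Return the shift d with the highest correlation score (first on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i]) - max_disp
```

In this sketch, a pixel whose best shift is near zero would be close to the focal plane, while a larger shift would suggest stronger defocus. One way to approximate the multi-scale aspect would be to repeat the same computation on downsampled features, although the exact scheme used by the paper is not described in this summary.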

Paper Structure

This paper contains 17 sections, 1 equation, 8 figures, 2 tables.

Figures (8)

  • Figure 1: DP sensor image formation.
  • Figure 2: The architecture of the proposed MCCNet. Note that the Cross-correlation module outputs both left and right features. The two Multi-scale Fusion modules (one each for left and right features) share parameters.
  • Figure 3: This figure shows how a given pixel in the left DP image is cross-correlated with pixels in the right DP image.
  • Figure 4: This figure describes the Cross-correlation module used in this work for explicit cross-correlation between the left and right DP images.
  • Figure 5: This figure describes the Multi-scale Feature Extraction module.
  • ...and 3 more figures