Adjust Your Focus: Defocus Deblurring From Dual-Pixel Images Using Explicit Multi-Scale Cross-Correlation
Kunal Swami
TL;DR
This work tackles defocus blur in photography by leveraging dual-pixel disparity cues through an explicit cross-correlation mechanism. It introduces MCCNet, a Siamese-encoder network with cross-correlation blocks and multi-scale fusion that guides deblurring regionally across the image. The model is trained with a Charbonnier plus MS-SSIM loss on the DPDD dataset and achieves state-of-the-art PSNR/SSIM with substantially fewer parameters and competitive FLOPs, as confirmed by ablations. The approach improves detail recovery in defocused regions and offers a computationally efficient solution for all-in-focus reconstruction from dual-pixel data.
Abstract
Defocus blur is a common problem in photography. It arises when an image is captured with a wide aperture, resulting in a shallow depth of field. Sometimes it is desired, e.g., for a portrait effect. Otherwise, it is a problem both aesthetically and for downstream computer vision tasks, such as segmentation and depth estimation. Deblurring an out-of-focus image to obtain an all-in-focus image is a highly challenging and often ill-posed problem. A recent work exploited dual-pixel (DP) image information, widely available in consumer DSLRs and high-end smartphones, to solve the problem of defocus deblurring. DP sensors produce two sub-aperture views containing defocus disparity cues. A given pixel's disparity is directly proportional to the distance from the focal plane. However, existing methods adopt a naïve approach of channel-wise concatenation of the two DP views without explicitly utilizing the disparity cues within the network. In this work, we propose to perform an explicit cross-correlation between the two DP views to guide the network toward appropriate deblurring in different image regions. We adopt multi-scale cross-correlation to handle blur and disparities at different scales. Quantitative and qualitative evaluation of our multi-scale cross-correlation network (MCCNet) reveals that it achieves better defocus deblurring than existing state-of-the-art methods despite having lower computational complexity.
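To make the idea of explicit multi-scale cross-correlation concrete, the following is a minimal NumPy sketch and not the MCCNet implementation. It assumes that the DP disparity lies along the horizontal axis, that correlation is a channel-averaged dot product over a small range of shifts, and that coarser scales are obtained by 2x average pooling. The function names (`horizontal_correlation`, `downsample2`, `multiscale_correlation`) and the parameters `max_disp` and `num_scales` are hypothetical and chosen for illustration only.

```python
import numpy as np


def horizontal_correlation(left, right, max_disp):
    """Correlate two (C, H, W) feature maps over horizontal shifts.

    Returns a (2 * max_disp + 1, H, W) volume. Channel k holds the
    channel-averaged product of left[:, y, x] and right[:, y, x + d],
    where d = k - max_disp. Out-of-range positions are zero-padded.
    """
    _, h, w = left.shape
    padded = np.pad(right, ((0, 0), (0, 0), (max_disp, max_disp)))
    volume = np.empty((2 * max_disp + 1, h, w), dtype=left.dtype)
    for k in range(2 * max_disp + 1):
        shifted = padded[:, :, k:k + w]  # right view shifted by d = k - max_disp
        volume[k] = (left * shifted).mean(axis=0)
    return volume


def downsample2(x):
    """2x average pooling of a (C, H, W) array (assumed way to build coarser scales)."""
    c, h, w = x.shape
    h2, w2 = h // 2 * 2, w // 2 * 2  # drop an odd trailing row/column, if any
    return x[:, :h2, :w2].reshape(c, h2 // 2, 2, w2 // 2, 2).mean(axis=(2, 4))


def multiscale_correlation(left, right, max_disp=2, num_scales=3):
    """Compute one correlation volume per scale, from fine to coarse."""
    volumes = []
    for _ in range(num_scales):
        volumes.append(horizontal_correlation(left, right, max_disp))
        left, right = downsample2(left), downsample2(right)
    return volumes


# Toy check: the right view is the left view displaced by one pixel,
# so the finest-scale correlation should peak at d = +1.
rng = np.random.default_rng(0)
left = rng.standard_normal((4, 16, 16))
right = np.roll(left, 1, axis=2)  # right[..., x] = left[..., x - 1]
volumes = multiscale_correlation(left, right, max_disp=2, num_scales=3)
best_shift = int(volumes[0].mean(axis=(1, 2)).argmax()) - 2
```

In a network, such volumes would typically be computed on encoder features rather than raw pixels and passed on to later layers. A fixed shift range at a coarser scale covers a larger displacement in the original image, which is one reason to correlate at several scales when blur and disparity vary across the scene.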
