Self-Supervised Spatially Variant PSF Estimation for Aberration-Aware Depth-from-Defocus

Zhuofeng Wu, Yusuke Monno, Masatoshi Okutomi

TL;DR

This work tackles depth-from-defocus under spatially varying aberrations by learning PSFs without ground-truth PSFs through self-supervision on real sharp/blurred image pairs captured via aperture changes. It introduces a polar-coordinate PSF model that exploits rotational symmetry, parameterized by image height $IH$ and polar angle $\theta$, and handles focus breathing by training PSFs per focus distance $f_d$. The estimated spatially variant PSFs are used to synthesize focal stacks for training DfD networks, which receive an image-height map as additional input to account for PSF variation, improving depth estimation on real data. Across synthetic and real experiments, the approach yields PSF estimates competitive with supervised baselines and enhances depth-from-defocus performance, enabling aberration-aware depth sensing with real cameras.

Abstract

In this paper, we address the task of aberration-aware depth-from-defocus (DfD), which takes account of spatially variant point spread functions (PSFs) of a real camera. To effectively obtain the spatially variant PSFs of a real camera without requiring any ground-truth PSFs, we propose a novel self-supervised learning method that leverages the pair of real sharp and blurred images, which can be easily captured by changing the aperture setting of the camera. In our PSF estimation, we assume rotationally symmetric PSFs and introduce the polar coordinate system to more accurately learn the PSF estimation network. We also handle the focus breathing phenomenon that occurs in real DfD situations. Experimental results on synthetic and real data demonstrate the effectiveness of our method regarding both the PSF estimation and the depth estimation.
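To make the polar-coordinate parameterization concrete, the following minimal sketch computes a per-pixel image height and polar angle for an image grid. It is not the authors' implementation: the function name `polar_coordinate_maps`, the choice of the geometric image center as the optical center, and the normalization of image height to 1 at the farthest corner are all assumptions made for illustration.

```python
import numpy as np


def polar_coordinate_maps(height, width):
    """Return (image_height_map, polar_angle_map) for a height x width image.

    Assumptions (not taken from the paper):
      * the optical center coincides with the geometric image center,
      * image height is normalized so the farthest corner has value 1,
      * the polar angle is in radians, measured with arctan2(dy, dx).
    """
    cy = (height - 1) / 2.0
    cx = (width - 1) / 2.0

    # Pixel coordinate grids: ys varies along rows, xs along columns.
    ys, xs = np.mgrid[0:height, 0:width].astype(np.float64)
    dy = ys - cy
    dx = xs - cx

    radius = np.hypot(dx, dy)
    max_radius = np.hypot(cx, cy)
    if max_radius > 0:
        image_height = radius / max_radius
    else:
        image_height = np.zeros_like(radius)  # degenerate 1x1 image

    polar_angle = np.arctan2(dy, dx)
    return image_height, polar_angle
```

Under a rotational-symmetry assumption, pixels that share the same image height are expected to have PSFs that differ only by a rotation given by the polar angle. A map like `image_height` could also be stacked with the focal stack as an extra input channel, in the spirit of the image-height map described in the TL;DR.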

Paper Structure (11 sections, 5 equations, 6 figures, 3 tables)

Figures (6)

  • Figure 1: The outline of this work. (a) We propose a novel self-supervised learning method for estimating spatially variant PSFs for each focus distance of the focal stack based on the pair of real sharp and blurred images. The obtained PSFs are used to generate synthetic focal stack images for depth-from-defocus (DfD) network training. (b) We train a DfD network with additional image height map information to make the network learn the depth map considering the spatial variance of the PSFs.
  • Figure 2: The overall flow of our self-supervised PSF-Net training.
  • Figure 3: Examples of the image misalignments across the real focal stack images caused by the focus breathing phenomenon.
  • Figure 4: Visual comparison of the PSF estimation results ($\theta=45^\circ$) on synthetic data.
  • Figure 5: PSF estimation results ($\theta=0^\circ$) using a real Olympus camera for the depth of 1.0 meters.
  • ...and 1 more figures
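
The self-supervised idea summarized in the Figure 1(a) caption can be sketched as a re-blurring check: a candidate PSF is applied to a sharp patch, and the result is compared with the real blurred patch captured at a different aperture. The sketch below is only an illustration under stated assumptions. The function names `reblur` and `reblur_loss`, the L1 loss, the edge-replicated padding, and the odd-sized kernel are hypothetical choices, not details confirmed by the source.

```python
import numpy as np


def reblur(sharp_patch, psf):
    """Convolve a 2D sharp patch with a PSF and return a same-sized result.

    Assumptions: the PSF has odd height and width, it is normalized to sum
    to 1 here, and borders are handled by replicating edge pixels.
    """
    psf = psf / psf.sum()
    kh, kw = psf.shape
    pad_y, pad_x = kh // 2, kw // 2
    padded = np.pad(sharp_patch, ((pad_y, pad_y), (pad_x, pad_x)), mode="edge")

    h, w = sharp_patch.shape
    out = np.zeros((h, w), dtype=np.float64)
    flipped = psf[::-1, ::-1]  # flip the kernel for true convolution
    for i in range(kh):
        for j in range(kw):
            out += flipped[i, j] * padded[i:i + h, j:j + w]
    return out


def reblur_loss(sharp_patch, blurred_patch, psf):
    """Mean absolute error between the re-blurred sharp patch and the real
    blurred patch (the L1 choice is an assumption)."""
    return float(np.mean(np.abs(reblur(sharp_patch, psf) - blurred_patch)))
```

In a training loop, a PSF estimation network would supply `psf` for each patch location and focus distance, and a loss of this kind would be minimized over many sharp/blurred pairs, so no ground-truth PSF is needed.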