Focal Split: Untethered Snapshot Depth from Differential Defocus

Junjie Luo, John Mamish, Alan Fu, Thomas Concannon, Josiah Hester, Emma Alexander, Qi Guo

TL;DR

Focal Split addresses the need for low-power, real-time depth sensing in dynamic scenes by combining a snapshot optomechanical design with a depth-from-differential-defocus estimator. It captures two differently defocused images simultaneously on two sensors through a beamsplitter and, after aligning the magnification of the two images, computes per-pixel depth with a compact closed-form expression, $Z = \frac{a}{b + \tilde{I}_s/\nabla^2 \tilde{I}}$. The handheld prototype runs on-board on a Raspberry Pi 5, draws about 4.9 W, and outputs 480×360 sparse depth maps at 2.1 FPS, with a working range of 0.4–1.2 m. The approach demonstrates motion-robust, untethered depth sensing at low computational cost, and the authors provide a DIY guide to enable broad adoption of snapshot, low-power depth cameras.
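The closed-form expression above can be evaluated per pixel with a few array operations. The following is a minimal sketch, not the authors' implementation. It assumes that $\tilde{I}_s$ is approximated by the difference of the two magnification-aligned images, that $\nabla^2 \tilde{I}$ is approximated by a 5-point Laplacian of their mean, and that `a` and `b` are calibration constants obtained elsewhere. The function names and the `eps` guard are hypothetical.

```python
import numpy as np


def laplacian(img):
    """5-point finite-difference Laplacian with replicated edges."""
    p = np.pad(img, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * img


def depth_from_pair(i1, i2, a, b, eps=1e-6):
    """Evaluate Z = a / (b + I_s / lap(I)) at every pixel.

    Assumptions (not from the source): I_s is the difference of the two
    aligned images, lap(I) is the Laplacian of their mean, and a, b are
    calibration constants. Pixels with a near-zero denominator become NaN.
    """
    i1 = np.asarray(i1, dtype=np.float64)
    i2 = np.asarray(i2, dtype=np.float64)
    i_s = i2 - i1                       # finite-difference stand-in for I_s
    lap = laplacian(0.5 * (i1 + i2))    # Laplacian of the mean image
    valid = np.abs(lap) > eps           # textureless pixels carry no depth cue
    ratio = np.divide(i_s, lap, out=np.zeros_like(i_s), where=valid)
    denom = b + ratio
    valid &= np.abs(denom) > eps
    depth = np.full(i_s.shape, np.nan)
    depth[valid] = a / denom[valid]
    return depth
```

Returning NaN where the Laplacian vanishes mirrors the sparse output described above: pixels without texture provide no defocus cue, so no depth is reported for them.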

Abstract

We introduce Focal Split, a handheld, snapshot depth camera with fully onboard power and computing based on depth-from-differential-defocus (DfDD). Focal Split is passive, avoiding the power consumption of light sources. Its achromatic optical system simultaneously forms two differentially defocused images of the scene, which can be independently captured using two photosensors in a snapshot. The data processing is based on the DfDD theory, which efficiently computes a depth and a confidence value for each pixel with only 500 floating point operations (FLOPs) per pixel from the camera measurements. We demonstrate a Focal Split prototype, which comprises a handheld custom camera system connected to a Raspberry Pi 5 for real-time data processing. The system consumes 4.9 W and is powered by a 5 V, 10,000 mAh battery. The prototype can measure objects at distances from 0.4 m to 1.2 m, outputting 480$\times$360 sparse depth maps at 2.1 frames per second (FPS) using unoptimized Python scripts. Focal Split is DIY friendly. A comprehensive guide to building your own Focal Split depth camera, code, and additional data can be found at https://focal-split.qiguo.org.
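The figures quoted in the abstract imply a rough power and compute budget. The sketch below is back-of-the-envelope arithmetic only. It assumes the 10,000 mAh capacity is rated at the stated 5 V and ignores conversion losses, so the runtime is an idealized upper bound rather than a measured value. The function names are hypothetical.

```python
def battery_energy_wh(voltage_v, capacity_mah):
    """Nominal energy in watt-hours, assuming capacity is rated at voltage_v."""
    return voltage_v * capacity_mah / 1000.0


def ideal_runtime_h(energy_wh, power_w):
    """Runtime in hours with no conversion losses (an upper bound)."""
    return energy_wh / power_w


def compute_rate_mflops(width, height, flops_per_pixel, fps):
    """Sustained depth-estimation workload in MFLOP/s."""
    return width * height * flops_per_pixel * fps / 1e6


energy = battery_energy_wh(5.0, 10_000)          # 50 Wh
runtime = ideal_runtime_h(energy, 4.9)           # about 10.2 h
rate = compute_rate_mflops(480, 360, 500, 2.1)   # about 181.44 MFLOP/s
joules_per_frame = 4.9 / 2.1                     # about 2.33 J per depth map

print(f"{energy:.0f} Wh, {runtime:.1f} h, {rate:.2f} MFLOP/s, "
      f"{joules_per_frame:.2f} J/frame")
```

Under these assumptions, the depth computation itself amounts to roughly 86.4 MFLOP per frame.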

Paper Structure

This paper contains 23 sections, 29 equations, 9 figures, 4 tables.

Figures (9)

  • Figure 1: Overview. (a) The principal eyes of jumping spiders comprise layered retinae, allowing the same scene to be imaged simultaneously at slightly different distances from the lens. This enables them to see two differentially defocused images of a target, from which depth can be estimated efficiently \cite{nagata2012depth}. (b) Focal Split's novel optomechanical setup leverages a beamsplitter and two photosensors placed at different distances from the lens to mimic the jumping spider's eye structure. (c) Our handheld, untethered Focal Split prototype can generate real-time sparse depth maps using battery-powered on-board computing.
  • Figure 2: The image formation model. The proposed algorithm calculates the derivative of the aligned images, $\tilde{I}_s$, as a cue for object depth (red arrow). In contrast, previous work \cite{alexander2019theory} uses two derivatives (blue arrows), $I_s$ and $xI_x+yI_y$, to approximate the same quantity, resulting in higher computational cost and numerical instability.
  • Figure 3: Sensitivity analysis using synthetic data. We simulate the image pair, $I_1$ and $I_2$, of fronto-parallel textured planes placed at different depths $Z$ and use the data to analyze the sensitivity of the algorithm. (a) Validation of the confidence metric. The overall depth estimation error, quantified by the mean absolute error (MAE), monotonically decreases as the normalized image derivative $|\tilde{I}_s/\max(\tilde{I}_s)|$ increases, suggesting that the latter is an effective confidence metric for the depth prediction. (b) Signal-to-noise ratio (SNR) of the image derivatives $\tilde{I}_s$ and $\nabla^2\tilde{I}$ estimated by finite differences. The vertical dashed line indicates the depth of the focal plane. The depth estimation becomes noisy when the SNR of both derivatives is too low. (c) Overall depth estimation error of the proposed depth equation (Eq. \ref{eq:focalsplit}) vs. the previously suggested equation (Eq. \ref{eq:old}). The proposed equation achieves higher accuracy for all sensor distance variations $\Delta s$. (d) Depth estimation error for Gaussian and pillbox-shaped PSFs.
  • Figure 4: PSFs of the image pair, $I_1$ and $I_2$, at different depths using the assembled Focal Split prototype. The PSFs are measured by taking pictures of a white LED point source. The focal planes of $I_1$ and $I_2$ are approximately at 0.7 m and 1.2 m, respectively.
  • Figure 5: Quantitative analysis of the Focal Split prototype using real captured data. (a) Depth estimation accuracy at different confidence levels. The sparsity indicates the percentage of discarded, least-confident pixels. (b) Working range, defined as the depths where the MAE is smaller than $5\%$ of the true depth, as a function of confidence levels.
  • ...and 4 more figures
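The confidence metric in Figure 3(a) and the sparsity and working-range definitions in Figure 5 can be illustrated with a short sketch. This is an illustrative reading of the captions, not the authors' code. It assumes the confidence is normalized by the maximum of $|\tilde{I}_s|$ so that it lies in $[0, 1]$, that sparsity is a fraction in $[0, 1]$, and that the working range is reported as the smallest and largest tested depths meeting the 5% criterion. All function names are hypothetical.

```python
import numpy as np


def confidence(i_s):
    """Normalized derivative magnitude |I_s| / max|I_s| (assumed normalization)."""
    mag = np.abs(np.asarray(i_s, dtype=np.float64))
    peak = mag.max()
    return mag / peak if peak > 0 else np.zeros_like(mag)


def sparsify(depth, conf, sparsity):
    """Discard the least-confident fraction `sparsity` of pixels (set to NaN)."""
    threshold = np.quantile(conf, sparsity)
    out = np.array(depth, dtype=np.float64, copy=True)
    out[conf < threshold] = np.nan
    return out


def working_range(true_depths, maes, rel_tol=0.05):
    """Span of tested depths whose MAE is below rel_tol * true depth.

    Returns (min_depth, max_depth), or None if no depth qualifies.
    Assumes the qualifying depths form one contiguous interval.
    """
    true_depths = np.asarray(true_depths, dtype=np.float64)
    ok = np.asarray(maes, dtype=np.float64) < rel_tol * true_depths
    if not ok.any():
        return None
    return float(true_depths[ok].min()), float(true_depths[ok].max())
```

For example, `sparsify(depth, confidence(i_s), 0.5)` keeps roughly the more confident half of the pixels. Raising the sparsity trades map density for accuracy, which is the trade-off Figure 5(a) quantifies.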