FoveaSPAD: Exploiting Depth Priors for Adaptive and Efficient Single-Photon 3D Imaging

Justin Folden, Atul Ingle, Sanjeev J. Koppal

TL;DR

FoveaSPAD targets the data bottlenecks and ambient-light vulnerability of SPAD LiDAR by introducing depth-prior–driven foveation that adaptively gates histogram data during capture. The approach leverages monocular, optical-flow, or coarse hardware priors to steer memory and depth foveation, substantially reducing memory and bandwidth while maintaining or improving depth accuracy. The authors provide a theoretical imaging model, SNR/SBR analyses under low and high ambient light, and demonstrate substantial memory savings (up to $1548$-fold in some scenarios) through simulations and hardware emulation, with practical considerations for future SPAD hardware (e.g., macropixels, tunable TDCs). This work offers a pathway to efficient, robust SPAD-based depth sensing suitable for resource-constrained platforms like robotics and autonomous systems.

Abstract

Fast, efficient, and accurate depth-sensing is important for safety-critical applications such as autonomous vehicles. Direct time-of-flight LiDAR has the potential to fulfill these demands, thanks to its ability to provide high-precision depth measurements at long standoff distances. While conventional LiDAR relies on avalanche photodiodes (APDs), single-photon avalanche diodes (SPADs) are an emerging image-sensing technology that offers many advantages such as extreme sensitivity and time resolution. In this paper, we remove the key challenges to widespread adoption of SPAD-based LiDARs: their susceptibility to ambient light and the large amount of raw photon data that must be processed to obtain in-pixel depth estimates. We propose new algorithms and sensing policies that improve signal-to-noise ratio (SNR) and increase computing and memory efficiency for SPAD-based LiDARs. During capture, we use external signals to foveate, i.e., guide how the SPAD system estimates scene depths. This foveated approach allows our method to "zoom into" the signal of interest, reducing the amount of raw photon data that needs to be stored and transferred from the SPAD sensor, while also improving resilience to ambient light. We show results both in simulation and with real hardware emulation, with specific implementations achieving a 1548-fold reduction in memory usage, and our algorithms can be applied to newly available and future SPAD arrays.
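
As a rough illustration of the foveation idea in the abstract, the following minimal Python sketch keeps only a short window of histogram bins around the bin predicted by a depth prior, then takes the peak of that window as the depth estimate. It is not the authors' implementation: the function names, the 100 ps bin width, the bin counts, the photon counts, and the simple jitter and ambient models are all assumptions chosen for illustration.

```python
import random

C = 299_792_458.0  # speed of light in m/s

def depth_to_bin(depth_m, bin_width_s):
    """Map a depth in metres to a round-trip time-of-flight bin index."""
    return int(round(2.0 * depth_m / C / bin_width_s))

def foveated_window(prior_depth_m, n_bins, m_bins, bin_width_s):
    """Return (start, stop) of an m_bins-wide window centred on the prior's bin.

    The window is clamped so that it stays inside the full n_bins range.
    """
    centre = depth_to_bin(prior_depth_m, bin_width_s)
    start = min(max(centre - m_bins // 2, 0), n_bins - m_bins)
    return start, start + m_bins

def capture_foveated(photon_bins, window):
    """Accumulate only the photons that fall inside the window.

    Photons outside the window are discarded, so only len(window) counters
    are stored instead of the full histogram.
    """
    start, stop = window
    hist = [0] * (stop - start)
    for b in photon_bins:
        if start <= b < stop:
            hist[b - start] += 1
    return hist

def estimate_depth(hist, window, bin_width_s):
    """Convert the peak bin of the foveated histogram back to metres."""
    start, _ = window
    peak = max(range(len(hist)), key=hist.__getitem__)
    return (start + peak) * bin_width_s * C / 2.0

# Hypothetical scene: true depth 6.0 m, imperfect prior of 6.4 m.
random.seed(0)
N_BINS, M_BINS, BIN_WIDTH_S = 1024, 64, 100e-12  # assumed sensor settings
true_bin = depth_to_bin(6.0, BIN_WIDTH_S)
signal = [true_bin + random.choice((-1, 0, 1)) for _ in range(200)]  # laser returns
ambient = [random.randrange(N_BINS) for _ in range(2000)]            # background light

window = foveated_window(6.4, N_BINS, M_BINS, BIN_WIDTH_S)
hist = capture_foveated(signal + ambient, window)
est = estimate_depth(hist, window, BIN_WIDTH_S)
print(f"stored {len(hist)} of {N_BINS} bins, estimated depth {est:.2f} m")
```

In this toy setting the prior is off by 0.4 m, yet the true peak still falls inside the 64-bin window, so only 1/16 of the bins are stored and most ambient photons are never recorded. If the prior error exceeded half the window width, the peak would be missed, which is why the window size has to be matched to the quality of the prior.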

Paper Structure

This paper contains 17 sections, 10 equations, 10 figures, 3 tables, 1 algorithm.

Figures (10)

  • Figure 1: Depth Prior Driven SPAD Depth Foveation: SPAD sensors suffer from a data bottleneck, since thousands of histogram bins are used to generate depth, as shown in the top left. Using fewer bins reduces depth resolution, as shown in the limited-bins depth result. Our idea is to use additional information, such as a color image (Sec. \ref{sec:4monofovea}, \ref{sec:hardware}) or optical flow (Sec. \ref{sec:6opflow}), to foveate the SPAD bins. For the same memory cost, we can therefore place the bins near where the histogram peak should be, resulting in accurate depth, as shown in the depth foveation result. The insets show that our method achieves the accuracy and resolution of ground truth with fewer bins. They also show that the depth prior, in this case monocular estimation, cannot provide the correct depth by itself, and that foveation is required.
  • Figure 2: Qualitative Comparison on NYUv2: Our memory and depth foveation techniques produce quality depth reconstructions with a fraction of the memory usage. Each row consists of the NYUv2 ground truth images, the monocular depth output from ZoeDepth, a simulated SPAD output with $N^\prime$ bins, and our foveation techniques. The rows show different combinations of $M$ and $N^\prime$, where $M$ is the number of bins in the foveated histograms and $N^\prime$ is the limited number of bins used for depth foveation. Monocular estimation is just one of a class of methods for obtaining a depth prior; Sec. \ref{sec:6opflow} and Sec. \ref{sec:hardware} show two more.
  • Figure 3: Spatio-temporal foveation: The first two columns display the scene's color and ground truth depth. Using the quantized monocular depth in the third column, we select certain pixels, shown in the fourth column. Processing only the histograms at these locations, with foveated windows, generates the results in the last column, indicating a 1548-fold reduction in memory usage. This figure is calculated by measuring the memory allocated for the full-resolution histograms and for the spatio-temporal histograms. The results shown use $M = \frac{1}{16}N$ and $N^\prime = 16$.
  • Figure 4: Optical Flow Driven Foveation: Our optical-flow-driven SPAD foveation, demonstrated in the CARLA simulator, whose color and ground-truth depth are shown in the first two columns. Directly using optical flow, as shown in the third column, creates errors that propagate over time. We correct for the optical flow error by detecting those pixels whose foveated windows are close to the noise floor. The last column shows the final optical-flow-driven foveated depth at different window sizes. Please see the supplementary material for video results.
  • Figure 5: Hardware emulation results for scenes from Lindell et al. \cite{lindell2018single}. (Column 1) The Lindell dataset consists of monochrome images captured by a camera co-aligned with the SPAD sensor that captures photon data cubes. (Column 2) We obtain monocular depth maps from these monochrome images. (Column 3) The raw photon data cube without foveation shows a "cloud" of background photon detections. (Column 4) Maxima detection on low-SBR photon clouds leads to unusable depth maps. (Column 5) The CNN-based algorithm of Lindell et al. improves depth map reconstruction. (Column 6) Our approach relies on memory foveation in a quarter-size sub-window around an estimate of the true depth obtained from the monocular depth maps. Observe that the photon data cubes are less noisy. (Column 7) Even a simple max-estimator provides better depth map estimates after foveation. (Column 8) Providing foveated clouds to the CNN denoiser of Lindell et al. further improves reconstructions.
  • ...and 5 more figures
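
The Figure 3 caption above describes the memory reduction as a ratio between the allocation for full-resolution histograms and the allocation for spatio-temporal (pixel-selected, windowed) histograms. The short Python sketch below shows how such a ratio could be computed. It is only an accounting illustration: the function names, the 2-byte counter size, the 480 x 640 resolution, and the fraction of retained pixels are hypothetical, and the example does not attempt to reproduce the reported 1548-fold figure.

```python
def histogram_bytes(n_pixels, n_bins, bytes_per_bin=2):
    """Memory needed to store one n_bins-long histogram per pixel."""
    return n_pixels * n_bins * bytes_per_bin

def reduction_factor(height, width, n_bins, kept_pixels, m_bins, bytes_per_bin=2):
    """Ratio of full-resolution memory to spatio-temporally foveated memory.

    Spatial foveation keeps only kept_pixels of the height * width pixels;
    temporal (memory) foveation keeps only m_bins of the n_bins bins.
    """
    full = histogram_bytes(height * width, n_bins, bytes_per_bin)
    foveated = histogram_bytes(kept_pixels, m_bins, bytes_per_bin)
    return full / foveated

# Hypothetical numbers: M = N/16 in time, one pixel in ten kept in space.
factor = reduction_factor(height=480, width=640, n_bins=1024,
                          kept_pixels=30_720, m_bins=64)
print(f"memory reduction: {factor:.0f}x")
```

Because the two savings multiply, a 16-fold temporal reduction combined with a 10-fold spatial reduction gives a 160-fold reduction in this example. Larger overall factors follow from selecting fewer pixels, using narrower windows, or both.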