Learning to Enhance Aperture Phasor Field for Non-Line-of-Sight Imaging

In Cho, Hyunbo Shim, Seon Joo Kim

TL;DR

LEAP tackles practical NLOS imaging challenges by combining a denoising autoencoder in the measurement space with phasor-domain supervision that restricts learning to the band of informative frequencies around the central illumination frequency $\Omega_C$. The method predicts clean, full aperture phasor fields $\mathcal{P}_{\mathcal{F}}(\mathbf{x_c},\Omega)$ from noisy partial inputs, and uses 2D FFT-based Rayleigh-Sommerfeld diffraction (RSD) to reconstruct hidden scenes. Across synthetic and real-world benchmarks, LEAP supports 16× or 64× fewer sampling points and 4× smaller apertures while delivering higher-fidelity reconstructions than strong baselines, with runtime on the order of tens of milliseconds. This phasor-based frequency management enables practical, hardware-efficient NLOS imaging suitable for real-time or near-real-time applications.
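To make the "2D FFT-based Rayleigh-Sommerfeld diffraction" step more concrete, the following is a minimal sketch, not the authors' implementation. It propagates a single-frequency phasor field from a square aperture grid to a parallel plane at a given depth by convolving with the kernel $e^{ikr}/r$ through zero-padded 2D FFTs. The function name `rsd_propagate`, its parameters, and the unnormalized kernel are assumptions made for illustration.

```python
import numpy as np

def rsd_propagate(field, pitch, wavelength, depth):
    """Propagate a monochromatic phasor field to a parallel plane (illustrative sketch).

    field      : (n, n) complex array sampled on the aperture (square grid assumed)
    pitch      : spacing between aperture samples, in meters
    wavelength : phasor wavelength, in meters
    depth      : distance from the aperture to the target plane, in meters
    """
    n = field.shape[0]
    k = 2.0 * np.pi / wavelength

    # Kernel sampled on every possible offset between an aperture point
    # and a target point: -(n-1)*pitch ... +(n-1)*pitch.
    offsets = np.arange(-(n - 1), n) * pitch
    xx, yy = np.meshgrid(offsets, offsets, indexing="ij")
    r = np.sqrt(xx**2 + yy**2 + depth**2)
    kernel = np.exp(1j * k * r) / r  # unnormalized RSD kernel (assumption)

    # Zero-pad to the full linear-convolution size so the FFT product
    # does not wrap around: n + (2n - 1) - 1 = 3n - 2.
    size = 3 * n - 2
    full = np.fft.ifft2(np.fft.fft2(field, s=(size, size)) *
                        np.fft.fft2(kernel, s=(size, size)))

    # Keep the part aligned with the original aperture grid.
    return full[n - 1:2 * n - 1, n - 1:2 * n - 1]
```

A full reconstruction would repeat this for each frequency in the band and each depth of interest and then combine the results; those steps are omitted here.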

Abstract

This paper aims to facilitate more practical NLOS imaging by reducing the number of samplings and scan areas. To this end, we introduce a phasor-based enhancement network that is capable of predicting clean and full measurements from noisy partial observations. We leverage a denoising autoencoder scheme to acquire rich and noise-robust representations in the measurement space. Through this pipeline, our enhancement network is trained to accurately reconstruct complete measurements from their corrupted and partial counterparts. However, we observe that the naive application of denoising often yields degraded and over-smoothed results, caused by unnecessary and spurious frequency signals present in measurements. To address this issue, we introduce a phasor-based pipeline designed to limit the spectrum of our network to the frequency range of interest, where the majority of informative signals are detected. The phasor wavefronts at the aperture, which are band-limited signals, are employed as inputs and outputs of the network, guiding our network to learn from the frequency range of interest and discard unnecessary information. The experimental results in more practical acquisition scenarios demonstrate that we can look around the corners with $16\times$ or $64\times$ fewer samplings and $4\times$ smaller apertures. Our code is available at https://github.com/join16/LEAP.
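The idea of limiting the spectrum to the frequency range of interest can be sketched as follows. This is an illustrative example under stated assumptions, not the released code: the transient measurements are transformed along time, weighted by a Gaussian centered at the central illumination frequency, and only the bins inside the band are kept as the aperture phasor field. The name `aperture_phasor_field`, the Gaussian width `sigma_freq`, and the cutoff of `num_sigma` standard deviations are hypothetical choices.

```python
import numpy as np

def aperture_phasor_field(transients, bin_width, center_freq, sigma_freq, num_sigma=2.0):
    """Band-limit transient measurements around a central frequency (illustrative sketch).

    transients  : real array of shape (..., T), time histograms per aperture point
    bin_width   : temporal bin width, in seconds
    center_freq : central illumination frequency, in Hz
    sigma_freq  : standard deviation of the Gaussian illumination spectrum, in Hz
    num_sigma   : half-width of the retained band, in units of sigma_freq (assumption)

    Returns the retained frequencies and the complex phasor field of shape (..., F).
    """
    num_bins = transients.shape[-1]
    freqs = np.fft.rfftfreq(num_bins, d=bin_width)
    spectrum = np.fft.rfft(transients, axis=-1)

    # Gaussian spectrum of the virtual illumination, centered at center_freq.
    window = np.exp(-0.5 * ((freqs - center_freq) / sigma_freq) ** 2)

    # Keep only the bins inside the band; everything else is discarded,
    # including broadband noise outside the informative range.
    band = np.abs(freqs - center_freq) <= num_sigma * sigma_freq
    return freqs[band], spectrum[..., band] * window[band]
```

Because the output contains only the in-band frequency bins, a network that takes such fields as inputs and targets never sees the out-of-band components at all, which is the intuition the abstract describes.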

Paper Structure (59 sections, 11 equations, 23 figures, 9 tables)

This paper contains 59 sections, 11 equations, 23 figures, 9 tables.

Figures (23)

  • Figure 1: (a) A typical NLOS imaging system. (b) More practical acquisition scenarios of NLOS imaging: sparse sampling and scanning with smaller apertures. (c) Results on confocal $16 \times 16$ measurements of Bike [lindell2019fk]. Our method exhibits high-quality results with $16 \times$ fewer sampling points and a shorter acquisition time, whereas the previous signal recovery network (SSN) [wang2023ssn] and a simple addition of the denoising criterion to SSN (SSN+) fail to correctly reconstruct the hidden objects.
  • Figure 2: (left) The illumination function in the frequency domain (top), and amplitudes of measurements of the Stanford bunny, both clean and with Poisson noise (bottom). Signals at the center pixel are visualized. (right) Reconstruction results of FK [lindell2019fk] on frequency-filtered measurements. Informative signals are mostly observed in a certain frequency range, whereas the Poisson noise affects the entire spectrum [hernandez2017spad].
  • Figure 3: Overview of the proposed LEAP. Our model takes noisy partial measurements and learns to predict clean and complete phasor wavefronts at the aperture. Hidden scenes are reconstructed by propagating the predicted phasor field with RSD [liu2020diffraction].
  • Figure 4: Qualitative results on Bike and Dragon from the Stanford real-world dataset [lindell2019fk]. We report the results of FK, LCT, and RSD with nearest-neighbor interpolation. Evaluation scenarios involve $16 \times 16$ and $8 \times 8$ sparse samplings with $2\ \textrm{m} \times 2\ \textrm{m}$ apertures, and a smaller $1\ \textrm{m} \times 1\ \textrm{m}$ aperture with $16 \times 16$ samplings.
  • Figure 5: Qualitative results of non-confocal $16 \times 16$ sparse samplings on Resolution from the real-world dataset [liu2020diffraction]. All methods employing signal recovery networks produce plausible results with sufficiently long exposure time and white diffuse objects.
  • ...and 18 more figures