Passive Non-Line-of-Sight Imaging with Light Transport Modulation

Jiarui Zhang, Ruixu Geng, Xiaolong Du, Yan Chen, Houqiang Li, Yang Hu

TL;DR

Passive non-line-of-sight imaging under varying light-transport conditions is challenging because the forward model $\mathbf{y}=\mathbf{A}\mathbf{x}+\mathbf{n}$ is ill-posed: the light-transport matrix $\mathbf{A}$ has a large condition number $\kappa(\mathbf{A})$, and it changes whenever the light-transport condition changes. The authors propose NLOS-LTM, a unified network that learns a latent light-transport representation from the projection image and uses it to modulate both the reconstruction and reprojection networks through multi-scale light-transport modulation blocks. The representation is discrete: a light-transport encoder produces a latent code that is vector-quantized against a codebook, and training combines an $L_{VQ}$ loss with a joint reprojection-reconstruction objective. Experiments on large-scale multi-condition datasets show substantial gains over condition-specific models and standard restoration baselines, with improved generalization and the ability to synthesize projection data via the reprojection network.
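To make the role of $\kappa(\mathbf{A})$ concrete, the following minimal sketch shows how a small amount of noise in $\mathbf{y}$ can be strongly amplified when $\mathbf{A}$ is ill-conditioned. The 2x2 matrix, the two-pixel "hidden image", and the noise vector are toy values chosen for illustration; they are assumptions and do not come from the paper.

```python
import numpy as np

# Hypothetical toy transport matrix: two nearly identical rows make A ill-conditioned.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])
x_true = np.array([1.0, 2.0])   # toy hidden image with two pixels
n = np.array([0.0, 1e-3])       # small measurement noise

y = A @ x_true + n              # forward model y = A x + n
x_hat = np.linalg.solve(A, y)   # naive inversion that ignores the noise

kappa = np.linalg.cond(A)       # 2-norm condition number of A
rel_noise = np.linalg.norm(n) / np.linalg.norm(A @ x_true)
rel_error = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)

print(f"kappa(A)       = {kappa:.3g}")
print(f"relative noise = {rel_noise:.3g}")
print(f"relative error = {rel_error:.3g}")
```

In this toy case the relative reconstruction error is several orders of magnitude larger than the relative noise, which is the kind of behavior that motivates learned reconstruction instead of direct inversion.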

Abstract

Passive non-line-of-sight (NLOS) imaging has witnessed rapid development in recent years, due to its ability to image objects that are out of sight. The light transport condition plays an important role in this task since changing the conditions will lead to different imaging models. Existing learning-based NLOS methods usually train independent models for different light transport conditions, which is computationally inefficient and impairs the practicality of the models. In this work, we propose NLOS-LTM, a novel passive NLOS imaging method that effectively handles multiple light transport conditions with a single network. We achieve this by inferring a latent light transport representation from the projection image and using this representation to modulate the network that reconstructs the hidden image from the projection image. We train a light transport encoder together with a vector quantizer to obtain the light transport representation. To further regulate this representation, we jointly learn both the reconstruction network and the reprojection network during training. A set of light transport modulation blocks is used to modulate the two jointly trained networks in a multi-scale way. Extensive experiments on a large-scale passive NLOS dataset demonstrate the superiority of the proposed method. The code is available at https://github.com/JerryOctopus/NLOS-LTM.
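The vector quantizer mentioned in the abstract can be illustrated with a nearest-neighbor codebook lookup, which is the standard form of vector quantization. This is only a sketch under assumptions: the function name `quantize`, the 2-D latent, and the three-entry codebook are hypothetical, and the paper's actual quantizer and its training loss may differ in detail.

```python
import numpy as np

def quantize(z, codebook):
    """Return (index, code) of the codebook row closest to z in Euclidean distance."""
    dists = np.sum((codebook - z) ** 2, axis=1)  # squared distance to every entry
    idx = int(np.argmin(dists))
    return idx, codebook[idx]

# Hypothetical codebook with three entries, one per "light transport condition".
codebook = np.array([[0.0, 0.0],
                     [1.0, 0.0],
                     [0.0, 1.0]])

z_e = np.array([0.9, 0.2])          # stand-in for the encoder output
idx, z_q = quantize(z_e, codebook)  # discrete index and quantized representation
print(idx, z_q)
```

Snapping the continuous encoder output to a codebook entry means that projection images captured under similar conditions can share the same discrete representation.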

Paper Structure (21 sections, 8 equations, 10 figures, 4 tables)

Figures (10)

  • Figure 1: Top row: Passive non-line-of-sight imaging setting. Light emitted from the hidden image is projected onto the relay surface and then captured by a camera. Bottom row: Comparison of three different solutions for NLOS imaging under multiple light transport conditions. (a) Train a condition-specific network for each light transport condition; (b) Train a condition-agnostic network without explicitly considering the light transport conditions; (c) Train a unified network for different light transport conditions, which uses a learned light transport representation to modulate the reconstruction network.
  • Figure 2: Architecture of the proposed NLOS-LTM method for passive NLOS imaging. A reprojection network $G_p$ that reprojects the hidden image to the projection image, and a reconstruction network $G_r$ that reconstructs the hidden image from the projection image are jointly learned during training. We use an encoder $E_c$ and a vector quantizer $Q_c$ to obtain a latent representation of the light transport condition associated with the projection image, which is then used to modulate the feature maps of the reprojection and the reconstruction networks through a set of light transport modulation (LTM) blocks. We also pretrain an auto-encoder, which consists of $E_h$ and $D_h$, with the hidden images. The decoder $D_h$ of the auto-encoder is taken as the decoder part of $G_r$. During testing, only $E_c$, $Q_c$ and $G_r$ are needed for NLOS reconstruction.
  • Figure 3: Detailed structure of the light transport modulation (LTM) block. $\mathbf{O}_{i}$ is the light transport representation upsampled to the $i$-th scale. $\mathbf{F}_{i}$ is the input feature map. $\mathbf{t}_{s,i}$ and $\mathbf{t}_{b,i}$ are the scaling and translation parameters, respectively. The LTM block modulates the input feature map with a normalization-style operation.
  • Figure 4: Visual comparison of the results on the All-MNIST dataset under different light transport conditions.
  • Figure 5: Visual comparison of the results on the All-Supermodel dataset under different light transport conditions.
  • ...and 5 more figures
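
The Figure 3 caption describes the LTM block as a normalization-style operation with scaling and translation parameters $\mathbf{t}_{s,i}$ and $\mathbf{t}_{b,i}$. One plausible reading is sketched below: normalize the feature map $\mathbf{F}_{i}$ per channel, then scale and shift it. Several parts are assumptions not stated in the caption: the exact formula, the per-channel normalization, and especially the hypothetical 1x1 projections `W_s` and `W_b` that derive the parameters from $\mathbf{O}_{i}$.

```python
import numpy as np

def ltm_modulate(F, t_s, t_b, eps=1e-5):
    """Normalize F per channel over its spatial dims, then scale by t_s and shift by t_b.

    F, t_s, t_b are arrays of shape (C, H, W).
    """
    mean = F.mean(axis=(1, 2), keepdims=True)
    std = F.std(axis=(1, 2), keepdims=True)
    F_norm = (F - mean) / (std + eps)
    return t_s * F_norm + t_b

rng = np.random.default_rng(0)
F = rng.normal(size=(4, 8, 8))   # input feature map F_i with 4 channels
O = rng.normal(size=(2, 8, 8))   # stand-in for the upsampled representation O_i

# Hypothetical 1x1 projections mapping O_i to per-pixel scale and shift (assumption).
W_s = rng.normal(size=(4, 2))
W_b = rng.normal(size=(4, 2))
t_s = np.einsum("co,ohw->chw", W_s, O)
t_b = np.einsum("co,ohw->chw", W_b, O)

out = ltm_modulate(F, t_s, t_b)
print(out.shape)
```

Because the scale and shift are computed from the light transport representation and not learned as fixed constants, the same network weights can process features differently for each condition.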