SCINeRF: Neural Radiance Fields from a Snapshot Compressive Image

Yunhao Li, Xiaodong Wang, Ping Wang, Xin Yuan, Peidong Liu

TL;DR

This work tackles the recovery of a 3D scene representation from a single image captured by Snapshot Compressive Imaging (SCI). It introduces SCINeRF, which uses a neural radiance field (NeRF) as the underlying 3D representation and jointly optimizes the NeRF parameters and the camera poses so that the rendered frames, once passed through the mask-based temporal coding, reproduce the compressed SCI measurement. By modeling the SCI image formation process and optimizing per scene at test time, SCINeRF restores images and synthesizes novel views, outperforming state-of-the-art SCI methods on both synthetic and real data. The approach enables high-frame-rate, multi-view consistent rendering from a single exposure, while the mask-based compression reduces storage requirements and offers potential privacy protection.

Abstract

In this paper, we explore the potential of the Snapshot Compressive Imaging (SCI) technique for recovering the underlying 3D scene representation from a single temporally compressed image. SCI is a cost-effective method that enables the recording of high-dimensional data, such as hyperspectral or temporal information, into a single image using low-cost 2D imaging sensors. To achieve this, a series of specially designed 2D masks is usually employed, which not only reduces storage requirements but also offers potential privacy protection. Taking this one step further, our approach builds upon the powerful 3D scene representation capabilities of neural radiance fields (NeRF). Specifically, we formulate the physical imaging process of SCI as part of the training of NeRF, allowing us to exploit its impressive performance in capturing complex scene structures. To assess the effectiveness of our method, we conduct extensive evaluations using both synthetic data and real data captured by our SCI system. The experimental results demonstrate that our proposed approach surpasses the state-of-the-art methods in terms of image reconstruction and novel-view image synthesis. Moreover, our method is also able to restore high-frame-rate, multi-view consistent images by leveraging SCI and the rendering capabilities of NeRF. The code is available at https://github.com/WU-CVGL/SCINeRF.
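
To make the "physical imaging process of SCI" concrete, the following is a minimal sketch of the standard video SCI forward model, in which each frame within the exposure is modulated element-wise by its own 2D mask and the modulated frames are summed into one measurement. It is an illustration under stated assumptions, not the authors' implementation: the function names `sci_measurement` and `measurement_loss`, the toy frame sizes, the random binary masks, and the choice of a mean squared error are all hypothetical, and any normalization used in the released code is not reproduced here.

```python
import random


def sci_measurement(frames, masks):
    """Compress T frames into one 2D image: Y = sum_t (M_t * X_t), element-wise."""
    if len(frames) != len(masks):
        raise ValueError("need exactly one mask per frame")
    height, width = len(frames[0]), len(frames[0][0])
    measurement = [[0.0] * width for _ in range(height)]
    for frame, mask in zip(frames, masks):
        for r in range(height):
            for c in range(width):
                measurement[r][c] += mask[r][c] * frame[r][c]
    return measurement


def measurement_loss(rendered_frames, masks, observed):
    """Mean squared error between re-compressed renders and the observed snapshot.

    Assumption: MSE is used here only as a plausible photometric loss.
    """
    synthesized = sci_measurement(rendered_frames, masks)
    errors = [
        (s - o) ** 2
        for s_row, o_row in zip(synthesized, observed)
        for s, o in zip(s_row, o_row)
    ]
    return sum(errors) / len(errors)


# Toy data (hypothetical): 8 frames of 4x4 pixels, one random binary mask each.
rng = random.Random(0)
T, H, W = 8, 4, 4
frames = [[[rng.random() for _ in range(W)] for _ in range(H)] for _ in range(T)]
masks = [[[rng.randint(0, 1) for _ in range(W)] for _ in range(H)] for _ in range(T)]

snapshot = sci_measurement(frames, masks)

# The loss vanishes when the candidate frames equal the frames that were captured.
print(measurement_loss(frames, masks, snapshot))
```

In a NeRF-based pipeline of the kind the abstract describes, `rendered_frames` would be images rendered from the radiance field at the estimated camera poses, so a loss of this form would let gradients reach both the scene representation and the poses.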

Paper Structure

This paper contains 10 sections, 6 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Given a single snapshot compressed image, our method is able to recover the underlying 3D scene representation. Leveraging the strong novel-view image synthesis capabilities of NeRF, we can render multi-view consistent images in high quality from the single measurement.
  • Figure 2: Overview of the proposed SCINeRF. Our method takes a single snapshot compressed image and corresponding masks as input, and recovers the underlying 3D scene representation as well as the camera motion trajectory within a single exposure time.
  • Figure 3: Experimental setup for real dataset collection. This SCI imaging system contains a CCD camera to record the snapshot measurement, primary and relay lenses, and a DMD to modulate the input frames.
  • Figure 4: Qualitative evaluations of our method against SOTA SCI image restoration methods on the synthetic dataset. Top to bottom shows the results for different scenes, including Cozy2room, Tanabata, Factory and Vender. The experimental results demonstrate that our method achieves superior performance on image restoration from a single compressed image (the far-left column).
  • Figure 5: Qualitative evaluations of our method against SOTA SCI image restoration methods on the real dataset captured by our system in Fig. 3. Top to bottom shows the results for different scenes. Since ground-truth images for the compressed measurements are unavailable in the real datasets, we capture separate scene images after capturing the snapshot compressed image and use them for reference. For qualitative evaluation, we render images from the 3D scene representations learned by SCINeRF. The results demonstrate that our SCINeRF surpasses existing image restoration methods on real datasets.