SpikeNeRF: Learning Neural Radiance Fields from Continuous Spike Stream

Lin Zhu, Kangmin Jia, Yifan Zhao, Yunshan Qi, Lizhi Wang, Hua Huang

TL;DR

SpikeNeRF is the first work to derive a NeRF-based volumetric scene representation from spike camera data. It describes how to optimize neural radiance fields to render photorealistic novel views from a continuous spike stream, and demonstrates advantages over other vision sensors in certain scenes.

Abstract

Spike cameras, leveraging spike-based integration sampling and high temporal resolution, offer distinct advantages over standard cameras. However, existing approaches reliant on spike cameras often assume optimal illumination, a condition frequently unmet in real-world scenarios. To address this, we introduce SpikeNeRF, the first work that derives a NeRF-based volumetric scene representation from spike camera data. Our approach leverages NeRF's multi-view consistency to establish robust self-supervision, effectively eliminating erroneous measurements and uncovering coherent structures within exceedingly noisy input amidst diverse real-world illumination scenarios. The framework comprises two core elements: a spike generation model incorporating an integrate-and-fire neuron layer and parameters accounting for non-idealities, such as threshold variation, and a spike rendering loss capable of generalizing across varying illumination conditions. We describe how to effectively optimize neural radiance fields to render photorealistic novel views from the novel continuous spike stream, demonstrating advantages over other vision sensors in certain scenes. Empirical evaluations conducted on both real and novel realistically simulated sequences affirm the efficacy of our methodology. The dataset and source code are released at https://github.com/BIT-Vision/SpikeNeRF.
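
The spike generation model described in the abstract (an integrate-and-fire layer with per-pixel threshold variation) can be illustrated with a minimal NumPy sketch. This is not the released SpikeNeRF code: the function name `simulate_spikes`, the Gaussian threshold model, the parameter values, and the subtractive reset are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_spikes(intensity, n_steps=256, base_threshold=1.0,
                    threshold_std=0.05, seed=0):
    """Hypothetical integrate-and-fire simulator (illustrative only).

    intensity: (H, W) array of non-negative per-step input values.
    Returns (spikes, threshold): spikes is (n_steps, H, W) with 0/1 entries,
    threshold is the (H, W) per-pixel firing threshold that was used.
    """
    rng = np.random.default_rng(seed)
    intensity = np.asarray(intensity, dtype=np.float64)

    # Assumption: threshold variation is a fixed per-pixel Gaussian perturbation.
    noise = rng.standard_normal(intensity.shape)
    threshold = base_threshold * (1.0 + threshold_std * noise)
    threshold = np.clip(threshold, 1e-3, None)  # keep thresholds positive

    potential = np.zeros_like(intensity)
    spikes = np.zeros((n_steps,) + intensity.shape, dtype=np.uint8)
    for t in range(n_steps):
        potential += intensity              # integrate incoming light
        fired = potential >= threshold      # fire where the threshold is reached
        spikes[t] = fired
        # Assumption: subtractive ("soft") reset that keeps the residual charge.
        potential[fired] -= threshold[fired]
    return spikes, threshold


if __name__ == "__main__":
    scene = np.array([[0.0, 0.25, 0.5, 1.0]])
    spikes, _ = simulate_spikes(scene, threshold_std=0.0)
    print(spikes.sum(axis=0))  # per-pixel spike counts over the window
```

With `threshold_std=0.0` every pixel shares the same threshold, so a pixel with input 0.25 fires on every fourth step and brighter pixels fire proportionally more often. A non-zero `threshold_std` makes equally lit pixels fire at slightly different rates, which is the kind of non-ideality the abstract refers to.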

Paper Structure (23 sections, 16 equations, 11 figures, 10 tables)

Figures (11)

  • Figure 1: Comparison of novel views across vision sensors. SpikeNeRF is the first method to learn neural radiance fields from a continuous spike stream. The spike camera, operating at 20,000 Hz, eliminates motion blur, distinguishing it from traditional cameras. Compared with other methods (e.g., TFI+NeRF recon, Spk2img+NeRF \cite{zhao2021spk2imgnet}, and EventNeRF \cite{enerf2}) and sensors (e.g., event camera), the rendered views of objects and scenes are significantly sharper.
  • Figure 2: The architecture of SpikeNeRF. Motivated by the objective of learning NeRFs from a continuous spike stream, we establish the connection between the pixel ray $r$ and the real-world spike stream $S$. To quantify the rendering loss in the spike domain, we integrate a spiking neuron layer following the NeRF MLP. The nonuniformity is captured through pixel-to-pixel threshold variation, simulated by the spiking neuron layer. This tandem of loss functions ensures that the model can effectively capture and represent scene geometry.
  • Figure 3: The backpropagation process of spiking neurons. The spiking neuron layer, comprising 256 time steps, follows the NeRF MLP. The weights of the last MLP layer can be updated through Backpropagation Through Time (BPTT) using Eq. \ref{eq:neuron}.
  • Figure 4: Quantitative results on synthetic spike data.
  • Figure 5: Quantitative results on real-world spike data.
  • ...and 6 more figures
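
The captions for Figures 1 and 3 give two numbers, a 20,000 Hz spike camera and a 256-step spiking neuron layer. The short sketch below works out what those numbers would imply if each time step corresponded to one camera sample, which is an assumption and is not stated in the captions. The helper names `window_duration_ms` and `firing_rate` are hypothetical, and the rate readout is a generic spike-count average, not the paper's rendering loss.

```python
import numpy as np

SAMPLING_RATE_HZ = 20_000  # spike camera rate quoted in the Figure 1 caption
N_STEPS = 256              # time steps quoted in the Figure 3 caption


def window_duration_ms(n_steps=N_STEPS, rate_hz=SAMPLING_RATE_HZ):
    """Duration of n_steps samples in milliseconds.

    Assumption: one simulated time step equals one camera sample.
    """
    return 1000.0 * n_steps / rate_hz


def firing_rate(spikes):
    """Average a binary (T, H, W) spike stream over time, giving (H, W) rates in [0, 1]."""
    return np.asarray(spikes, dtype=np.float64).mean(axis=0)


if __name__ == "__main__":
    print(window_duration_ms())           # length of a 256-sample window in ms
    demo = np.zeros((4, 1, 2), dtype=np.uint8)
    demo[::2, 0, 1] = 1                   # second pixel fires on every other step
    print(firing_rate(demo))
```

Under that assumption a 256-step window covers 12.8 ms of scene time, and the per-pixel firing rate over such a window is a simple quantity that a predicted spike stream could be compared against.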