Generalized Event Cameras

Varun Sundar, Matthew Dutson, Andrei Ardelean, Claudio Bruschini, Edoardo Charbon, Mohit Gupta

TL;DR

This work addresses a limitation of traditional event cameras: they encode only changes in brightness, so their output loses most scene intensity information. It proposes generalized event cameras implemented on SPAD sensors, organized along two design axes, the integrator ($\Sigma$) and the change detector ($\Delta$), to enable intensity-preserving, bandwidth-efficient imaging. The authors develop several SPAD-based designs (adaptive-EMA, Bayesian change detection, spatiotemporal chunks, and coded-exposure events) and demonstrate high-speed reconstructions (up to $3025$ FPS) with readout reductions of roughly $80\times$, while enabling plug-and-play inference with standard vision models. They further show on-chip feasibility on the UltraPhase architecture and discuss limitations and near-term improvements, underscoring the potential of intensity-preserving, near-sensor processing with single-photon sensors for general-purpose, high-frame-rate imaging.

Abstract

Event cameras capture the world at high time resolution and with minimal bandwidth requirements. However, event streams, which only encode changes in brightness, do not contain sufficient scene information to support a wide variety of downstream tasks. In this work, we design generalized event cameras that inherently preserve scene intensity in a bandwidth-efficient manner. We generalize event cameras in terms of when an event is generated and what information is transmitted. To implement our designs, we turn to single-photon sensors that provide digital access to individual photon detections; this modality gives us the flexibility to realize a rich space of generalized event cameras. Our single-photon event cameras are capable of high-speed, high-fidelity imaging at low readout rates. Consequently, these event cameras can support plug-and-play downstream inference, without capturing new event datasets or designing specialized event-vision models. As a practical implication, our designs, which involve lightweight and near-sensor-compatible computations, provide a way to use single-photon sensors without exorbitant bandwidth costs.
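The two questions in the abstract, when an event is generated and what is transmitted, can be illustrated with a minimal per-pixel sketch. It is not the authors' implementation. It assumes a fixed-threshold change detector driven by an exponential moving average (EMA) of binary photon detections, and an integrator that transmits the mean photon rate accumulated since the previous event. The function name `generalized_event_pixel` and the parameters `alpha` and `threshold` are hypothetical, and their values are chosen only for illustration.

```python
import random


def generalized_event_pixel(photons, alpha=0.01, threshold=0.3):
    """Sketch of one pixel: EMA change detector + adaptive-exposure integrator.

    photons: sequence of 0/1 photon detections (binary frames at one pixel).
    Returns a list of (start, end_exclusive, mean_rate) events.
    All names and default values here are illustrative assumptions.
    """
    photons = list(photons)
    events = []
    ema = 0.0        # change-detector state: EMA of the binary detections
    reference = 0.0  # EMA value recorded at the most recent event
    start = 0        # first frame of the current adaptive exposure
    count = 0        # photons accumulated in the current exposure

    for t, bit in enumerate(photons):
        ema = (1.0 - alpha) * ema + alpha * bit
        count += bit
        # "When": fire once the EMA has drifted far enough from the reference.
        if abs(ema - reference) > threshold:
            # "What": transmit the mean rate over the exposure, not just a sign.
            events.append((start, t + 1, count / (t + 1 - start)))
            reference = ema
            start, count = t + 1, 0

    # Flush the exposure that is still open when the stream ends.
    if len(photons) > start:
        events.append((start, len(photons), count / (len(photons) - start)))
    return events


# Simulated pixel: a dim phase followed by a bright phase.
rng = random.Random(0)
photons = [int(rng.random() < 0.1) for _ in range(2000)]
photons += [int(rng.random() < 0.8) for _ in range(2000)]

events = generalized_event_pixel(photons)
print(len(events), "events for", len(photons), "binary frames")
for start, end, rate in events[:3]:
    print(f"frames [{start}, {end}): mean rate {rate:.3f}")
```

In this toy setup the pixel stays silent during the static dim phase and reports a long, low-noise exposure once the brightness changes, which is the sense in which the readout is far sparser than the binary frame stream while still carrying intensity.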

Paper Structure (70 sections, 17 equations, 29 figures, 2 tables, 5 algorithms)

Figures (29)

  • Figure 1: Generalized event cameras. (top) Event cameras generate outputs in response to abrupt changes in scene intensity. We describe this as a combination of a low-pass integrator and a threshold-based change detector. (middle) We generalize the space of event cameras by designing integrators that capture rich intensity information, and more reliable change detectors that utilize larger spatiotemporal contexts and noise-aware thresholding (see the sections on bit-plane events, chunk events, and coded events). Unlike existing events, our generalized event streams inherently preserve scene intensity, e.g., this ping-pong ball slingshotted against a brick wall backdrop. (bottom) Generalized event cameras enable high-fidelity bandwidth-efficient imaging: providing $3025$ FPS reconstructions with a readout equivalent to a $30$ FPS camera. Consequently, generalized events facilitate plug-and-play inference on a multitude of tasks in challenging scenarios (insets depict the extent of motion over $30$ ms).
  • Figure 2: Altering "what to transmit." (a) We sum the events generated by a jack-in-the-box toy as it springs up. This sum gives a lossy encoding of brightness changes in dynamic regions. (b) Transmitting levels instead of changes helps recover details in static regions. (c) Adaptive exposures, which accumulate flux between consecutive events, provide substantial noise reduction.
  • Figure 3: Bayesian- vs. EMA-based change detection. (left) A fixed-threshold change detector (used in adaptive-EMA) makes it difficult to segment low-contrast changes. (center) The Bayesian formulation attunes to the stochasticity in incident flux and can detect fine-grained changes such as the corners of the hole saw bit; (right) as a result, the integrator captures the rotational dynamics.
  • Figure 4: Spatiotemporal chunk events. We evaluate the difference between the current chunk and a stored reference in a learned linear-feature space. Unlike the L2 norm, which is permutation-invariant, the feature-space norm is sensitive to spatial structure. Randomly shuffling the pixel values reduces the transform-domain norm (the shuffled patch has a more "noise-like" structure).
  • Figure 5: High-speed videography of a stress ball hurled at a coffee mug. (top row) This indoor scene is challenging for existing imaging systems, including: high-speed cameras (SNR-related artifacts), event cameras (poor restoration quality), and even hybrid event + frame techniques (reconstruction artifacts). (bottom rows) In contrast, our generalized event cameras capture the stress ball's extensive deformations with high fidelity and an efficient readout.
  • ...and 24 more figures
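
The contrast drawn in the Figure 4 caption, between a permutation-invariant L2 norm and a structure-sensitive feature-space norm, can be checked with a small sketch. The paper uses learned linear features. As a stand-in, this example assumes a fixed linear feature map that averages non-overlapping 2×2 cells, so `block_mean_features` and the 8×8 step-edge patch are hypothetical choices made only for illustration.

```python
import math
import random


def l2_norm(patch):
    """Plain L2 norm over all pixel values (ignores where the values sit)."""
    return math.sqrt(sum(v * v for row in patch for v in row))


def block_mean_features(patch, block=2):
    """Hypothetical linear feature map: means of non-overlapping block x block cells."""
    size = len(patch)
    feats = []
    for r in range(0, size, block):
        for c in range(0, size, block):
            cell = [patch[r + i][c + j] for i in range(block) for j in range(block)]
            feats.append(sum(cell) / len(cell))
    return feats


def feature_norm(patch):
    """L2 norm measured after the linear feature map."""
    return math.sqrt(sum(f * f for f in block_mean_features(patch)))


# Structured 8x8 patch: a vertical step edge (left half dark, right half bright).
structured = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]

# Same pixel values, randomly permuted across the patch.
flat = [v for row in structured for v in row]
random.Random(0).shuffle(flat)
shuffled = [flat[i * 8:(i + 1) * 8] for i in range(8)]

print("L2 norm:      ", l2_norm(structured), l2_norm(shuffled))
print("feature norm: ", feature_norm(structured), feature_norm(shuffled))
```

Shuffling leaves the L2 norm unchanged because the set of pixel values is the same, while the averaged features of the shuffled patch partially cancel, so its feature-space norm is smaller than that of the step edge.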