CodedEvents: Optimal Point-Spread-Function Engineering for 3D-Tracking with Event Cameras

Sachin Shah, Matthew Albert Chan, Haoming Cai, Jingxi Chen, Sakshum Kulshrestha, Chahat Deep Singh, Yiannis Aloimonos, Christopher Metzler

TL;DR

The paper establishes theoretical limits (Cramér-Rao bounds) on 3D point localization and tracking with PSF-engineered event cameras, then designs phase and binary amplitude masks for tracking, using implicit neural representations to overcome the non-convexity of the design problem.

Abstract

Point-spread-function (PSF) engineering is a well-established computational imaging technique that uses phase masks and other optical elements to embed extra information (e.g., depth) into the images captured by conventional CMOS image sensors. To date, however, PSF engineering has not been applied to neuromorphic event cameras, a powerful new image sensing technology that responds to changes in the log-intensity of light. This paper establishes theoretical limits (Cramér-Rao bounds) on 3D point localization and tracking with PSF-engineered event cameras. Using these bounds, we first demonstrate that existing Fisher phase masks are already near-optimal for localizing static flashing point sources (e.g., blinking fluorescent molecules). We then demonstrate that existing designs are sub-optimal for tracking moving point sources and proceed to use our theory to design optimal phase masks and binary amplitude masks for this task. To overcome the non-convexity of the design problem, we leverage novel parameterizations of the phase and amplitude masks based on implicit neural representations. We demonstrate the efficacy of our designs through extensive simulations. We also validate our method with a simple prototype.

Paper Structure (24 sections, 1 theorem, 19 equations, 16 figures, 4 tables)

Key Result

Lemma S4.1

The log-intensity difference, $f(t_\text{end}) - f(t_\text{start})$, is proportional to the binned event pixel value, $\sum_{i=1}^n p_i$, with error $|\epsilon|<1$.
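
One way to read Lemma S4.1 is through an idealized, noise-free event pixel that emits a $\pm 1$ event each time the log-intensity moves one contrast threshold away from its reference level. The sketch below is an illustration under assumptions that do not come from the paper: the threshold `theta`, the test signal, and the function name `bin_events` are all hypothetical, and the error $\epsilon$ is measured in units of the threshold, i.e. $f(t_\text{end}) - f(t_\text{start}) = \theta\,(\sum_i p_i + \epsilon)$.

```python
import math

def bin_events(log_intensity, theta):
    """Idealized event pixel (an assumption, not the paper's sensor model):
    emit +1/-1 whenever the log-intensity is at least theta away from the
    reference level, and return the list of emitted polarities."""
    start = log_intensity[0]
    count = 0  # net number of events so far; reference level = start + theta * count
    polarities = []
    for f in log_intensity[1:]:
        while f - (start + theta * count) >= theta:
            polarities.append(+1)
            count += 1
        while f - (start + theta * count) <= -theta:
            polarities.append(-1)
            count -= 1
    return polarities

# Hypothetical log-intensity trace: the pixel brightens, then dims again.
samples = [math.log(1.0 + 4.0 * math.sin(math.pi * k / 1000) ** 2 + 0.002 * k)
           for k in range(1001)]
theta = 0.1  # hypothetical contrast threshold

events = bin_events(samples, theta)
binned = sum(events)  # the binned event pixel value, sum of p_i

# Residual between the true log difference and the binned value, in threshold units.
epsilon = (samples[-1] - samples[0]) / theta - binned
print(f"binned = {binned}, |epsilon| < 1: {abs(epsilon) < 1}")
```

In this model the bound holds by construction: after each sample the reference level sits strictly within one threshold of the current log-intensity, so the residual at the end of the interval is smaller than one threshold.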

Figures (16)

  • Figure 1: Prototype. Top: The fabricated mask is placed at the aperture plane of an event camera with a $50$ mm focal length lens. Bottom: Sample captured event frames for a point source.
  • Figure 2: Binning events approximates the log difference as the number of accumulated frames increases. In the first image, a point source at depth plane $1\,\mu$m moves from the blue location to the red location over a fixed time interval. The second image illustrates direct access to the difference in \ref{eq:optimistic-evt}, while the subsequent images demonstrate the effect of accumulating $N$ event frames across the time interval. Observe how large $N$ nearly recovers $\Delta L$, demonstrating the validity of the approximation.
  • Figure 2: Real-world 3D tracking. Comparison between NAM and Open apertures for depth estimation at 1000 FPS. Error bars show the 90% interquartile range.
  • Figure 3: System overview. (a) An MLP produces a phase or amplitude mask based on a grid of $x,y$ coordinates. The weights are updated through back-propagation of the CRB computed with Brownian Motion. (b) In simulation, coded events are generated by first rendering high-frame-rate coded CMOS frames and converting them to event frames. These measurements are passed to a 3D-tracking algorithm.
  • Figure 3: Designed Phase Masks and corresponding PSFs for specific speeds. Each row visualizes the neural phase mask designed for tracking particles moving at $N$ nanometers per time interval. Observe that the optimal design for 'fast' moving particles is the Fisher design.
  • ...and 11 more figures
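
The coordinate-to-mask mapping described in the system overview caption (an MLP evaluated on a grid of $x,y$ coordinates) can be sketched roughly as follows. This is only a forward pass with random weights: the layer sizes, sine activations, $\tanh$ output scaling, and circular aperture are hypothetical choices, not the paper's architecture, and the CRB loss and back-propagation step are omitted.

```python
import math
import random

def make_mlp(sizes, seed=0):
    """Randomly initialise weights and biases for a small fully connected net."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
        biases = [rng.gauss(0.0, 1.0) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers

def mlp_phase(layers, x, y):
    """Map one aperture coordinate (x, y) to a phase delay in [-pi, pi]."""
    h = [x, y]
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(weights, biases)]
        is_last = i == len(layers) - 1
        # Sine activations on hidden layers (a hypothetical choice); raw output on the last.
        h = z if is_last else [math.sin(v) for v in z]
    return math.pi * math.tanh(h[0])

def render_mask(layers, n):
    """Evaluate the net on an n x n grid over [-1, 1]^2; zero outside the unit disc
    (the circular aperture is an assumption of this sketch)."""
    coords = [-1.0 + 2.0 * k / (n - 1) for k in range(n)]
    return [[mlp_phase(layers, x, y) if x * x + y * y <= 1.0 else 0.0
             for x in coords] for y in coords]

layers = make_mlp([2, 16, 16, 1])
mask = render_mask(layers, 32)
print(len(mask), len(mask[0]))
```

Because the mask is a function of continuous coordinates, the same weights can be rendered at any grid resolution by changing `n`.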

Theorems & Definitions (2)

  • Lemma S4.1
  • proof