Gradient events: improved acquisition of visual information in event cameras

Eero Lehtonen; Tuomo Komulainen; Ari Paasio; Mika Laiho

TL;DR

The paper addresses the flood of uninformative events that event cameras produce under flickering illumination. It introduces gradient events, which are built from ternary image gradients with a position-dependent threshold $\Theta$. Gradient events keep the asynchronous, per-pixel encoding of brightness events and allow straightforward Poisson-based grayscale reconstruction from the discrete Laplacian $\nabla^2 I$ with a successive over-relaxation (SOR) solver; the method uses only three tunable parameters. Empirically, gradient-event reconstructions outperform state-of-the-art brightness-event methods across multiple public datasets, with benefits including reduced lag and hardware-friendly computation. The work presents gradient events as a promising, hardware-efficient way to improve visual information acquisition in event cameras, with potential utility for downstream computer vision tasks.
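As a rough illustration of the ternary gradients mentioned above, the following sketch thresholds forward-difference gradients of a grayscale image in $[0, 1]$. It is not the authors' implementation. The threshold values are taken from the Figure 1 caption, but the rule that assigns a threshold to each pixel position (`(x + y) % 3` here), the forward-difference gradient, and the function names `threshold_map` and `ternary_gradients` are assumptions made for this example.

```python
import numpy as np

# Threshold values quoted in the Figure 1 caption.
THRESHOLDS = np.array([4, 8, 16]) / 255.0


def threshold_map(height, width):
    """Per-pixel threshold. ASSUMPTION: thresholds cycle with (x + y) % 3;
    the spatial layout used in the paper is not given in this summary."""
    ys, xs = np.indices((height, width))
    return THRESHOLDS[(xs + ys) % 3]


def ternary_gradients(image):
    """Return (T_X, T_Y) with values in {-1, 0, +1}.

    ASSUMPTION: gradients are forward differences, set to zero at the
    last column (for X) and the last row (for Y).
    """
    image = np.asarray(image, dtype=float)
    gx = np.zeros_like(image)
    gy = np.zeros_like(image)
    gx[:, :-1] = image[:, 1:] - image[:, :-1]
    gy[:-1, :] = image[1:, :] - image[:-1, :]
    theta = threshold_map(*image.shape)
    # +1 where the gradient exceeds +theta, -1 where it is below -theta.
    tx = np.sign(gx) * (np.abs(gx) > theta)
    ty = np.sign(gy) * (np.abs(gy) > theta)
    return tx.astype(np.int8), ty.astype(np.int8)
```

With a horizontal ramp whose step is 10/255 per pixel, positions assigned the 4/255 or 8/255 threshold report $+1$, while positions assigned 16/255 report $0$; this is the kind of pattern a position-dependent threshold is meant to produce.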

Abstract

Current event cameras are bio-inspired sensors that respond to brightness changes in the scene asynchronously and independently for every pixel, and transmit these changes as ternary event streams. Event cameras have several benefits over conventional digital cameras, such as significantly higher temporal resolution and pixel bandwidth, resulting in reduced motion blur, and very high dynamic range. However, they also introduce challenges, such as the difficulty of applying existing computer vision algorithms to the output event streams and the flood of uninformative events in the presence of oscillating light sources. Here we propose a new type of event, the gradient event, which benefits from the same properties as a conventional brightness event, but which is by design much less sensitive to oscillating light sources, and which enables considerably better grayscale frame reconstruction. We show that gradient-event-based video reconstruction outperforms existing state-of-the-art brightness-event-based methods by a significant margin when evaluated on publicly available event-to-video datasets. Our results show how gradient information can be used to significantly improve the acquisition of visual information by an event camera.

Paper Structure (12 sections, 20 equations, 7 figures, 3 tables)

Introduction
Methods
From gradient images to ternary gradients
Gradient events
Resolution compression
Reconstruction of grayscale images
Computing an approximation of the discrete Laplacian
Solving the Poisson's equation with successive over-relaxation
Reconstruction example
Results
Discussion
Conclusion

Figures (7)

Figure 1: Ternary gradients using the position-dependent thresholds $\{t_0 = 4/255, \ t_1 = 8/255, \ t_2 = 16/255\}$. Left inset: original image. Middle inset: horizontal ternary gradient $T_X(x,y)$. Right inset: vertical ternary gradient $T_Y(x,y)$. In the middle and right insets, ternary gradients equal to $+1$ are represented by white pixels, and ternary gradients equal to $-1$ are represented by black pixels. Gray pixels denote positions where the ternary gradient equals zero.
Figure 2: Parallelization of the SOR algorithm. Here $R(x,y)$ denotes the reconstructed value at a given iteration step of the algorithm. Each iteration step first updates the cells drawn as solid circles and then the cells drawn as dashed circles; the lattice continues with alternating solid and dashed circles. To simplify the notation, $\beta$ denotes the over-relaxation parameter divided by four, $\beta = \alpha / 4$.
Figure 3: Reconstruction from approximate Laplacian using the SOR algorithm with $\alpha=1.97$ and $k=100$ iterations. Left inset: original image. Middle inset: reconstruction using quantized gradients $\hat{G}_X(x,y)$ and $\hat{G}_Y(x,y)$. Right inset: reconstruction using resolution compressed approximate gradients $\hat{G}_X^\textrm{RC}(x,y)$ and $\hat{G}_Y^\textrm{RC}(x,y)$. The approximate Laplacians are multiplied by the constant $c=3.6$ in order to make the variance of the reconstructed image similar to the variance of the original image. For visualization, the mean values of the reconstructed images were subtracted from the reconstructed images and the mean of the original image was added to them. As can be seen, the resolution compression yields reconstruction artefacts especially in high-frequency parts of the image.
Figure 4: Number of frames corresponding to event probability (the number of events normalized by the number of pixels in a frame) over the datasets ECD, MVSEC and HQF, corresponding to the ground truth grayscale frames. For the brightness events, at most one event per pixel position was counted, and the count was obtained for timestamps between the previous and the considered ground truth frame. For the gradient events, resolution compression was used in order to have the spatial resolution the same as for brightness events. The total event probability for gradient events with resolution compression was approximately 0.11, and the total event probability for brightness events was approximately 0.15.
Figure 5: Effect of using multiple thresholds on the reconstruction quality. Left inset: original image. Middle inset: thresholds $\{4/256, 8/256, 16/256\}$ and scaling parameter $c=3.6$. Right inset: threshold $4/256$ and scaling parameter $c=4.3$. With only one threshold there are more reconstruction artefacts, for example around the head and the left hand, than with three thresholds.
...and 2 more figures
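
The checkerboard update described in the Figure 2 caption and the mean alignment described in the Figure 3 caption can be sketched as follows. This is a minimal sketch and not the authors' code. It assumes unit grid spacing, a five-point Laplacian, and replicated-edge (zero-flux) boundaries, none of which are specified in this summary. The function names `laplacian`, `sor_poisson` and `match_mean` are made up for the example. The defaults $\alpha = 1.97$ and 100 iterations are the values quoted in the Figure 3 caption.

```python
import numpy as np


def laplacian(image):
    """Five-point discrete Laplacian. ASSUMPTION: replicated edges."""
    p = np.pad(image, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
            - 4.0 * image)


def sor_poisson(lap, alpha=1.97, iterations=100):
    """Solve laplacian(R) = lap with checkerboard (red-black) SOR.

    Cells of one colour only depend on cells of the other colour, so each
    half-step can update all cells of that colour in parallel.
    """
    lap = np.asarray(lap, dtype=float)
    recon = np.zeros_like(lap)
    beta = alpha / 4.0  # same convention as in the Figure 2 caption
    ys, xs = np.indices(lap.shape)
    masks = [(xs + ys) % 2 == 0, (xs + ys) % 2 == 1]
    for _ in range(iterations):
        for mask in masks:
            p = np.pad(recon, 1, mode="edge")
            neighbours = (p[:-2, 1:-1] + p[2:, 1:-1]
                          + p[1:-1, :-2] + p[1:-1, 2:])
            update = (1.0 - alpha) * recon + beta * (neighbours - lap)
            recon[mask] = update[mask]
    return recon


def match_mean(recon, reference):
    """Shift the reconstruction to the reference mean; the Poisson
    solution is only defined up to an additive constant here."""
    return recon - recon.mean() + reference.mean()
```

On a small test image, `match_mean(sor_poisson(laplacian(img), alpha=1.5, iterations=500), img)` should recover `img` to within numerical tolerance, since the Laplacian here is exact. With quantized or resolution-compressed gradients the right-hand side is only approximate, which is why the captions report a scaling constant $c$ and visible artefacts.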
