GERD: Geometric event response data generation

Jens Egholm Pedersen, Dimitris Korakovounis, Jörg Conradt

TL;DR

The paper addresses the lack of geometric grounding in event-based vision by introducing GERD, a simulator that generates synthetic event streams from objects undergoing controlled affine and temporal transformations. It details a pipeline that renders shapes, applies sub-pixel, time-varying transformations, and produces sparse, two-channel events with configurable noise and a PyTorch data loader. The authors showcase applications for mock stimuli, transformation-invariance testing, and covariance analysis to probe how event-based systems respond to geometric changes. This toolbox provides a principled sandbox to study transformation effects and to train models with improved generalization for event-based vision, potentially bridging gaps with traditional frame-based approaches.
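
As a rough illustration of how sparse, two-channel events can come out of a moving shape, the sketch below differences two binary frames of a square translated by one pixel. It is a minimal NumPy sketch and not GERD's actual API: the helper names `render_square` and `frames_to_events`, the 8x8 canvas, and the channel order (positive first, negative second) are all assumptions made for this example.

```python
import numpy as np

def render_square(size, top, left, side):
    """Render a binary square on a size x size canvas (hypothetical helper)."""
    frame = np.zeros((size, size), dtype=np.int8)
    frame[top:top + side, left:left + side] = 1
    return frame

def frames_to_events(prev, curr):
    """Difference two frames into a (2, H, W) event array.

    Channel 0 marks pixels that switched on (positive polarity),
    channel 1 marks pixels that switched off (negative polarity).
    The channel order is an assumption of this sketch.
    """
    diff = curr.astype(np.int8) - prev.astype(np.int8)
    return np.stack([diff > 0, diff < 0]).astype(np.uint8)

prev = render_square(8, top=2, left=2, side=3)
curr = render_square(8, top=2, left=3, side=3)  # same square, one pixel to the right
events = frames_to_events(prev, curr)
```

Only the leading and trailing edges of the square produce events; the unchanged interior stays silent, which is what makes the output sparse.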

Abstract

Event-based vision sensors are appealing because of their high time resolution, high dynamic range, and low power consumption. They also provide data that is fundamentally different from that of conventional frame-based cameras: events are sparse, discrete, and require integration in time. Unlike conventional models grounded in established geometric and physical principles, event-based models lack comparable foundations. We introduce a method to generate event-based data under controlled transformations. Specifically, we subject a prototypical object to transformations that change over time to produce carefully curated event videos. We hope this work simplifies studies of geometric approaches in event-based vision. GERD is available at https://github.com/ncskth/gerd

Paper Structure

This paper contains 11 sections and 6 figures.

Figures (6)

  • Figure 1: The three built-in shape templates in the dataset are used to generate sparse signals when moved. (a) The prototypical shapes: square, circle, and triangle. (b) The shapes translated to the right. (c) The difference between two frames generates a sparse frame with positive changes in green and negative changes in red.
  • Figure 2: By controlling the amount of transformation, we can control the amount of signal per time step. Here, a circle shrinks by three different amounts in a single time step, producing increasingly sparse frames. The red color indicates a negative polarity change.
  • Figure 3: We upsample and integrate pixels to compensate for aliasing effects when generating events. (a) A downsampled image of the top-left part of a circle. Each pixel is either on or off. (b) When upsampling the image from (a) we increase the granularity of the integration.
  • Figure 4: A subset of the rendering parameters. F represents a function type that changes the translation velocity over time. T represents a PyTorch tensor type.
  • Figure 5: A square that is scaled by one pixel, subject to three types of noise. (a) 10% general background noise. (b) 10% noise in the geometry. (c) 10% event sampling noise.
  • ...and 1 more figure
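
The upsample-and-integrate idea in the Figure 3 caption can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions, not GERD's implementation: the function `render_circle`, its parameters, and the block-averaging scheme are hypothetical choices for this example. The disc is rasterized on a grid `upsample` times finer than the output, and each block of fine pixels is averaged into one output pixel, so boundary pixels take fractional coverage values instead of a hard on/off value.

```python
import numpy as np

def render_circle(size, cx, cy, radius, upsample=1):
    """Rasterize a disc at `upsample`x resolution, then average blocks down.

    Hypothetical helper: coordinates are in output-pixel units, and each
    fine-grid sample sits at the centre of its fine pixel.
    """
    n = size * upsample
    coords = (np.arange(n) + 0.5) / upsample
    yy, xx = np.meshgrid(coords, coords, indexing="ij")
    fine = ((xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2).astype(np.float64)
    # Integrate each (upsample x upsample) block into a single output pixel.
    return fine.reshape(size, upsample, size, upsample).mean(axis=(1, 3))

hard = render_circle(16, cx=8.0, cy=8.0, radius=5.0, upsample=1)  # every pixel is on or off
soft = render_circle(16, cx=8.0, cy=8.0, radius=5.0, upsample=8)  # fractional edge coverage
```

With fractional coverage, a sub-pixel change in position or scale shifts edge intensities gradually rather than flipping whole pixels at once, which is the aliasing effect the caption refers to.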