Rethinking Video with a Universal Event-Based Representation

Andrew Freeman

TL;DR

The dissertation introduces ADΔER (pronounced "adder"), a universal event-based video representation that encodes per-pixel absolute intensities as asynchronous events, rather than relying on frames or on the data format of any one event sensor. The framework is organized into Acquisition, Representation, and Application layers: transcoders bring framed and event sources into a single representation, which in turn supports source-modeled lossy compression, frame reconstruction for classical applications, and bespoke asynchronous vision tasks. Key contributions include a detailed event-pixel list transcoder model, a two-stage compression approach with CABAC-compatible entropy coding, and demonstrations of application-speed gains and data-rate reductions across framed and DVS/DAVIS sources. The work also provides open-source, Rust-based software for end-to-end transcoding, compression, and visualization, with practical relevance to surveillance, edge computing, and future intensity-event sensors such as ASINT and Aeveon. Together, these pieces enable scalable, rate-adaptive, cross-modal vision pipelines, with implications for large-scale video analytics and resource-constrained sensing.

Abstract

Traditionally, video is structured as a sequence of discrete image frames. Recently, however, a novel video sensing paradigm has emerged which eschews video frames entirely. These "event" sensors aim to mimic the human vision system with asynchronous sensing, where each pixel has an independent, sparse data stream. While these cameras enable high-speed and high-dynamic-range sensing, researchers often revert to a framed representation of the event data for existing applications, or build bespoke applications for a particular camera's event data type. At the same time, classical video systems have significant computational redundancy at the application layer, since pixel samples are repeated across frames in the uncompressed domain. To address the shortcomings of existing systems, I introduce Address, Decimation, Δt Event Representation (ADΔER, pronounced "adder"), a novel intermediate video representation and system framework. The framework transcodes a variety of framed and event camera sources into a single event-based representation, which supports source-modeled lossy compression and backward compatibility with traditional frame-based applications. I demonstrate that ADΔER achieves state-of-the-art application speed and compression performance for scenes with high temporal redundancy. Crucially, I describe how ADΔER unlocks an entirely new control mechanism for computer vision: application speed can correlate with both the scene content and the level of lossy compression. Finally, I discuss the implications for event-based video on large-scale video surveillance and resource-constrained sensing.
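As a rough illustration of what an Address, Decimation, Δt event might look like, the following minimal sketch models one event as a pixel address plus a decimation exponent `d` and a tick count `delta_t`. It assumes the event means "2^d intensity units accumulated over Δt ticks"; that reading, along with every class, field, and function name here, is an assumption made for illustration and is not taken from the abstract or from the dissertation's Rust software.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Event:
    """Hypothetical ADΔER-style event; field names are illustrative only."""
    x: int        # pixel column (address)
    y: int        # pixel row (address)
    d: int        # decimation exponent: the event reports 2**d intensity units
    delta_t: int  # clock ticks elapsed while those units accumulated


def intensity_per_tick(event: Event) -> float:
    """Average absolute intensity implied by one event (assumed model)."""
    return (2 ** event.d) / event.delta_t


def integrate_pixel(events: Sequence[Event], ticks_per_frame: int) -> float:
    """Sum one pixel's intensity over a frame window of ticks_per_frame ticks.

    Each event contributes its per-tick intensity for the ticks it covers;
    the last event is clipped if it extends past the end of the window.
    """
    total = 0.0
    remaining = ticks_per_frame
    for ev in events:
        covered = min(ev.delta_t, remaining)
        total += intensity_per_tick(ev) * covered
        remaining -= covered
        if remaining == 0:
            break
    return total
```

Under this assumed model, a static pixel can be described by a single event with a long `delta_t` instead of one sample per frame, which is one way to picture how temporal redundancy reduces the event rate, and how integrating events over a fixed window recovers a frame-compatible value.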

Paper Structure (140 sections, 17 equations, 55 figures, 7 tables)

Introduction
Definitions
Motivation
Thesis Statement
Proposed System and Contributions
Acquisition Layer
Representation Layer
Application Layer
Dissertation Overview
Background: Data Compression and Video Systems
Introduction
Data Compression
Arithmetic Coding
Adaptive Arithmetic Coding
Context-Adaptive Binary Arithmetic Coding
...and 125 more sections

Figures (55)

Figure 1: Abstract overview of the AD$\Delta$ER framework.
Figure 2: Detailed diagram of the three-layer AD$\Delta$ER framework. Italicized names reflect the names of software packages in the Rust Package Registry. Dashed lines indicate future work. With AD$\Delta$ER, framed and event-based video sources can be transcoded to a common representation. Since there is a single raw representation, we can have a simple source-modeled compression scheme. The representation supports bespoke event-based applications, while being backwards compatible with classical applications.
Figure 3: Abstract overview of the AD$\Delta$ER transcoder.
Figure 4: A naive ASCII encoder. The output data is unchanged from the input.
Figure 5: An encoder for only capital letters with equal probability.
...and 50 more figures

Theorems & Definitions (1)

Definition
