EventMamba: Enhancing Spatio-Temporal Locality with State Space Models for Event-Based Video Reconstruction

Chengjie Ge; Xueyang Fu; Peng He; Kunyu Wang; Chengzhi Cao; Zheng-Jun Zha

TL;DR

EventMamba targets event-based video reconstruction (EBVR), where existing Vision Mamba models lose translation invariance and spatio-temporal locality. It introduces Random Window Offset (RWO) in the spatial domain and Hilbert/Trans-Hilbert space-filling curve serialization for spatio-temporal ordering, both integrated into a Mamba-based architecture. On HQF, IJRR, and MVSEC, it achieves state-of-the-art or competitive reconstruction quality at favorable speed, which suggests practical potential for real-time EBVR on resource-constrained devices.
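To make the serialization idea concrete, the following is a minimal 2D sketch of Hilbert-curve ordering compared with plain row-major (reshape) ordering. It is not the authors' implementation: the function names are hypothetical, the grid is assumed to be square with a power-of-two side, and reading "Trans-Hilbert" as the coordinate-swapped curve is an assumption made only for illustration.

```python
def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order square grid."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so sub-curves join end to end.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_order(order):
    """Visit every cell of the grid once, in Hilbert-curve order."""
    n = 1 << order
    return [hilbert_d2xy(order, d) for d in range(n * n)]


def transposed(path):
    """Assumed reading of 'Trans-Hilbert': the same curve with x and y swapped."""
    return [(y, x) for x, y in path]


def raster_order(n):
    """Row-major order, i.e. what a plain reshape of the grid produces."""
    return [(x, y) for y in range(n) for x in range(n)]


def max_step(path):
    """Largest Manhattan distance between consecutive cells in a path."""
    return max(abs(x1 - x0) + abs(y1 - y0)
               for (x0, y0), (x1, y1) in zip(path, path[1:]))


if __name__ == "__main__":
    print("hilbert max step:", max_step(hilbert_order(3)))
    print("raster  max step:", max_step(raster_order(8)))
```

On an 8x8 grid, consecutive tokens in the Hilbert sequence are always grid neighbors, whereas the row-major sequence jumps across the whole row at every line end. That jump is the loss of local connectivity the summary refers to.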

Abstract

Leveraging its robust linear global modeling capability, Mamba has notably excelled in computer vision. Despite its success, existing Mamba-based vision models have overlooked the nuances of event-driven tasks, especially in video reconstruction. Event-based video reconstruction (EBVR) demands spatial translation invariance and close attention to local event relationships in the spatio-temporal domain. Unfortunately, conventional Mamba algorithms apply static window partitions and standard reshape scanning methods, leading to significant losses in local connectivity. To overcome these limitations, we introduce EventMamba, a specialized model designed for EBVR tasks. EventMamba incorporates random window offset (RWO) in the spatial domain, moving away from the restrictive fixed partitioning. Additionally, it features a new consistent traversal serialization approach in the spatio-temporal domain, which maintains the proximity of adjacent events both spatially and temporally. These enhancements enable EventMamba to retain Mamba's robust modeling capabilities while preserving the spatio-temporal locality of event data. Comprehensive testing on multiple datasets shows that EventMamba markedly enhances video reconstruction, improving computation speed while delivering superior visual quality compared to Transformer-based methods.
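The contrast between fixed partitioning and a random window offset can be sketched as follows. This is an illustrative guess at the mechanism, not the paper's code: the function names are hypothetical, the offset is applied as a cyclic shift (an assumption), and the grid height and width are assumed to be multiples of the window size.

```python
import random


def partition_with_offset(height, width, win, offset=(0, 0)):
    """Group grid cells into win x win windows after a cyclic shift by offset.

    Assumes height and width are multiples of win. The cyclic wrap-around
    at the borders is an illustrative choice, not taken from the paper.
    """
    dy, dx = offset
    windows = {}
    for r in range(height):
        for c in range(width):
            key = (((r + dy) % height) // win, ((c + dx) % width) // win)
            windows.setdefault(key, []).append((r, c))
    return list(windows.values())


def random_window_partition(height, width, win, rng=random):
    """Draw a fresh offset in [0, win) per axis, then partition the grid."""
    offset = (rng.randrange(win), rng.randrange(win))
    return offset, partition_with_offset(height, width, win, offset)


def same_window(windows, a, b):
    """True if cells a and b fall into the same window."""
    return any(a in w and b in w for w in windows)


if __name__ == "__main__":
    fixed = partition_with_offset(8, 8, 4)
    shifted = partition_with_offset(8, 8, 4, offset=(2, 0))
    # Rows 3 and 4 are neighbors but sit on a fixed window boundary.
    print(same_window(fixed, (3, 0), (4, 0)))
    print(same_window(shifted, (3, 0), (4, 0)))
    print(random_window_partition(8, 8, 4, rng=random.Random(0))[0])
```

With a fixed partition, two neighboring cells on either side of a window boundary never share a window. Drawing a new offset each time means boundaries fall in different places, so no pair of neighbors is permanently separated, which is one way to read the translation-invariance motivation in the abstract.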
