Efficient Event Camera Volume System

Juan Camilo Soto, Ian Noronha, Saru Bharti, Upinder Kaur

Abstract

Event cameras promise low latency and high dynamic range, yet their sparse output is difficult to integrate into standard robotic pipelines. We introduce EECVS (Efficient Event Camera Volume System), a framework that models event streams as continuous-time Dirac impulse trains, enabling compression free of temporal binning artifacts through direct transform evaluation at event timestamps. Our key innovation combines density-driven adaptive selection among DCT, DTFT, and DWT transforms with coefficient pruning strategies tailored to each domain's sparsity characteristics, so the compression strategy adapts automatically to real-time event density analysis. On the EHPT-XC and MVSEC datasets, our framework achieves superior reconstruction fidelity, with DTFT delivering the lowest earth mover's distance. In downstream segmentation, EECVS generalizes across datasets: when evaluated with EventSAM, it achieves a mean IoU of 0.87 on MVSEC versus 0.44 for voxel grids at 24 channels, while remaining competitive on EHPT-XC. Our ROS2 implementation supports real-time deployment, with DCT processing achieving 1.5 ms latency and 2.7× higher throughput than the alternative transforms. Together, these results establish the first adaptive event compression framework that maintains both computational efficiency and superior generalization across diverse robotic scenarios.
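
To make "direct transform evaluation at event timestamps" concrete, here is a minimal sketch for a single pixel. It relies only on the standard identity that an impulse train x(t) = Σ p_i δ(t − t_i) has Fourier transform X(ω) = Σ p_i e^{−jω t_i}, so no temporal binning is needed. The function names, the signed-polarity convention, and the continuous cosine basis cos(πkt/T) used as a DCT analogue are illustrative assumptions, not the paper's exact formulation.

```python
import cmath
import math


def impulse_train_dtft(timestamps, polarities, omegas):
    """Return X(w) = sum_i p_i * exp(-j * w * t_i) for each w in omegas.

    Hypothetical helper: events are treated as Dirac impulses weighted by
    polarity, so the transform is a sum over events, not over time bins.
    """
    return [
        sum(p * cmath.exp(-1j * w * t) for t, p in zip(timestamps, polarities))
        for w in omegas
    ]


def impulse_train_cosine(timestamps, polarities, num_coeffs, window):
    """Return c_k = sum_i p_i * cos(pi * k * t_i / window), k = 0..num_coeffs-1.

    Assumed DCT-like basis on a window of length `window` (same unit as the
    timestamps); the paper's normalization may differ.
    """
    return [
        sum(
            p * math.cos(math.pi * k * t / window)
            for t, p in zip(timestamps, polarities)
        )
        for k in range(num_coeffs)
    ]


# Two events in a 1-second window: ON at t = 0.0 s, OFF at t = 0.5 s.
spectrum = impulse_train_dtft([0.0, 0.5], [1.0, -1.0], [0.0, 2.0 * math.pi])
cosine = impulse_train_cosine([0.0, 0.5], [1.0, -1.0], num_coeffs=2, window=1.0)
```

The cost of each coefficient scales with the number of events in the window rather than with a fixed grid resolution, which is the property that suits sparse event streams.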

Paper Structure (18 sections, 7 equations, 6 figures, 5 tables)

Figures (6)

  • Figure 1: Event-to-dense representation in EECVS. Incoming event streams are processed within the framework and converted into compact dense representations through the application of DCT, DTFT, or DWT.
  • Figure 2: Compression process for a single event window. Events are aggregated, transformed with a basis selected according to activity density, pruned by either low-frequency retention (DCT) or magnitude selection (DTFT/DWT), and packed into dense representations.
  • Figure 3: Per-pixel DTFT within a window. Events sample the transform bases directly, and coefficients are pruned according to the transform-specific retention strategy.
  • Figure 4: Histogram of event densities across the EHPT-XC and MVSEC datasets. EHPT-XC exhibits high-density real-world sequences, while MVSEC contains more moderate motion-driven densities.
  • Figure 5: Qualitative results on the EHPT-XC dataset, showing robustness in high-density, real-world scenarios.
  • ...and 1 more figure
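
The two pruning strategies named in the Figure 2 caption can be sketched as follows: low-frequency retention for DCT and magnitude selection for DTFT/DWT. This is a minimal illustration under stated assumptions. The function names and the `keep` budget are hypothetical, coefficients are assumed to be ordered from low to high frequency, and the transform is passed in by name because the density thresholds that drive basis selection are not given here.

```python
def keep_low_frequency(coeffs, keep):
    """Retain the first `keep` coefficients (lowest frequencies); zero the rest."""
    return [c if i < keep else 0 for i, c in enumerate(coeffs)]


def keep_largest_magnitude(coeffs, keep):
    """Retain the `keep` coefficients with the largest magnitude; zero the rest."""
    ranked = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(ranked[:keep])
    return [c if i in kept else 0 for i, c in enumerate(coeffs)]


def prune(coeffs, transform, keep):
    """Dispatch to the retention strategy associated with each transform."""
    if transform == "DCT":
        return keep_low_frequency(coeffs, keep)
    if transform in ("DTFT", "DWT"):
        return keep_largest_magnitude(coeffs, keep)
    raise ValueError(f"unknown transform: {transform}")


# Illustrative coefficients, ordered from low to high frequency.
coeffs = [0.5, -3.0, 0.1, 2.0]
dct_pruned = prune(coeffs, "DCT", keep=2)
dtft_pruned = prune(coeffs, "DTFT", keep=2)
```

With the same budget, the DCT path keeps the first two entries regardless of size, while the DTFT/DWT path keeps the two strongest entries wherever they fall. Both functions also work on complex coefficients, since `abs` returns the magnitude.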