FlexEvent: Towards Flexible Event-Frame Object Detection at Varying Operational Frequencies

Dongyue Lu; Lingdong Kong; Gim Hee Lee; Camille Simon Chane; Wei Tsang Ooi

TL;DR

FlexEvent tackles object detection with event cameras at varying operational frequencies. It introduces FlexFuse, an adaptive event-frame fusion module, and FlexTune, a frequency-adaptive fine-tuning strategy, enabling robust detection from low to extremely high frequencies (e.g., $20$ Hz to $180$ Hz). The approach uses a dual-branch architecture (RVT for events, ResNet-50 for frames) with learnable gating to balance the two modalities across frequencies, and self-training with pseudo-labels to generalize to unseen temporal resolutions. Empirical results on large-scale DSEC variants show significant mAP gains over state-of-the-art methods and robustness to frequency shifts, supporting real-time deployment in dynamic environments.
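To make the idea of learnable gating between two modality branches concrete, here is a minimal NumPy sketch. It is not the FlexFuse module itself: the function name `gated_fusion`, the per-channel sigmoid gate, and the parameter shapes are all illustrative assumptions, since this summary does not specify the actual design.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def gated_fusion(event_feat, frame_feat, w_gate, b_gate):
    """Blend two feature vectors with a per-channel gate (hypothetical sketch).

    event_feat, frame_feat: arrays of shape (C,), one per modality branch.
    w_gate: array of shape (C, 2C); b_gate: array of shape (C,).
    In a trained model w_gate and b_gate would be learned; here they are inputs.
    """
    # The gate looks at both modalities before deciding how to mix them.
    joint = np.concatenate([event_feat, frame_feat])
    gate = sigmoid(w_gate @ joint + b_gate)
    # gate -> 1 favours the event branch, gate -> 0 favours the frame branch.
    fused = gate * event_feat + (1.0 - gate) * frame_feat
    return fused, gate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    channels = 4
    events = rng.normal(size=channels)
    frames = rng.normal(size=channels)
    weights = rng.normal(size=(channels, 2 * channels))
    bias = np.zeros(channels)
    fused, gate = gated_fusion(events, frames, weights, bias)
    print("gate:", gate)
    print("fused:", fused)
```

With all-zero gate parameters the gate is 0.5 everywhere and the output is the plain average of the two branches; a large positive bias pushes the output toward the event features. Under this assumed formulation, that is the lever a model could use to lean on events when frames become stale at high frequencies.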

Abstract

Event cameras offer unparalleled advantages for real-time perception in dynamic environments, thanks to their microsecond-level temporal resolution and asynchronous operation. Existing event detectors, however, are limited by fixed-frequency paradigms and fail to fully exploit the high temporal resolution and adaptability of event data. To address these limitations, we propose FlexEvent, a novel framework that enables detection at varying frequencies. Our approach consists of two key components: FlexFuse, an adaptive event-frame fusion module that integrates high-frequency event data with rich semantic information from RGB frames, and FlexTune, a frequency-adaptive fine-tuning mechanism that generates frequency-adjusted labels to enhance model generalization across varying operational frequencies. This combination allows our method to detect objects with high accuracy in both fast-moving and static scenarios, while adapting to dynamic environments. Extensive experiments on large-scale event camera datasets demonstrate that our approach surpasses state-of-the-art methods, achieving significant improvements in both standard and high-frequency settings. Notably, our method maintains robust performance when scaling from 20 Hz to 90 Hz and delivers accurate detection up to 180 Hz, demonstrating its effectiveness in extreme conditions. Our framework sets a new benchmark for event-based object detection and paves the way for more adaptable, real-time vision systems.
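To give a feel for the frequencies quoted above, the sketch below converts an operational frequency into the time available per detection step. The helper names `window_ms` and `steps_per_base_interval` are hypothetical, and treating each detection step as one event-accumulation window of length $1/f$ is an assumption, not something this abstract states.

```python
def window_ms(frequency_hz):
    """Duration of one detection step in milliseconds (assumes window = 1 / f)."""
    if frequency_hz <= 0:
        raise ValueError("frequency_hz must be positive")
    return 1000.0 / frequency_hz


def steps_per_base_interval(frequency_hz, base_hz=20.0):
    """How many steps at frequency_hz fit into one step at the base frequency."""
    if frequency_hz <= 0 or base_hz <= 0:
        raise ValueError("frequencies must be positive")
    return frequency_hz / base_hz


if __name__ == "__main__":
    # The three frequencies named in the abstract.
    for hz in (20, 90, 180):
        print(f"{hz:>3} Hz: {window_ms(hz):.2f} ms per step, "
              f"{steps_per_base_interval(hz):.1f} steps per 20 Hz interval")
```

Under this assumption, moving from 20 Hz to 180 Hz shrinks the per-step window from 50 ms to roughly 5.6 ms, so each window sees about one ninth of the time span, which is one way to picture why detection gets harder at higher frequencies.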
