Event Quality Score (EQS): Assessing the Realism of Simulated Event Camera Streams via Distances in Latent Space
Kaustav Chanda, Aayush Atul Verma, Arpitsinh Vaghela, Yezhou Yang, Bharatesh Chakravarthi
TL;DR
The paper tackles the sim-to-real gap in event-camera data by introducing Event Quality Score (EQS), a differentiable metric that directly compares two raw event streams using latent features from a pre-trained recurrent vision transformer. EQS operates on event tensors and leverages activations from the first convolutional blocks to quantify similarity via latent-space distances, providing a numeric measure of realism for simulated streams. Empirical results on the DSEC driving dataset show that higher EQS aligns with better generalization of models trained on simulated data to real-world data, with ESIM producing the closest match to real noise patterns and the smallest sim-to-real gap. This metric offers a principled, task-agnostic way to optimize simulators and could be incorporated as a loss to produce more realistic event streams for downstream vision tasks.
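The idea of comparing two event streams through latent-space distances can be illustrated with a minimal sketch. It is not the authors' implementation: `placeholder_features` is a hypothetical stand-in (simple 2x2 average pooling) for the early blocks of the pre-trained recurrent vision transformer, and the tensor layout, the function names, and the unit-normalized L2 distance are all assumptions made for illustration. This summary does not state how the distance is mapped to the final score, so the sketch returns only a distance, where lower means more similar.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Assumed event layout: (x, y, timestamp, polarity in {0, 1}).
Event = Tuple[int, int, float, int]
Tensor = List[List[List[float]]]  # indexed as [channel][y][x]


def events_to_tensor(events: Sequence[Event], width: int, height: int,
                     t_bins: int, t_start: float, t_end: float) -> Tensor:
    """Histogram events into 2 * t_bins channels (polarity x time bin)."""
    tensor = [[[0.0] * width for _ in range(height)] for _ in range(2 * t_bins)]
    span = t_end - t_start
    for x, y, t, p in events:
        if not (t_start <= t < t_end):
            continue  # ignore events outside the time window
        b = min(int((t - t_start) / span * t_bins), t_bins - 1)
        tensor[p * t_bins + b][y][x] += 1.0
    return tensor


def placeholder_features(tensor: Tensor) -> List[float]:
    """Hypothetical stand-in for pre-trained network activations:
    2x2 average pooling per channel, flattened into one vector."""
    feats: List[float] = []
    for channel in tensor:
        h, w = len(channel), len(channel[0])
        for by in range(0, h, 2):
            for bx in range(0, w, 2):
                cells = [channel[y][x]
                         for y in range(by, min(by + 2, h))
                         for x in range(bx, min(bx + 2, w))]
                feats.append(sum(cells) / len(cells))
    return feats


def unit_normalize(v: Sequence[float]) -> List[float]:
    """Scale a feature vector to unit L2 norm (all-zero vectors are kept)."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v] if norm > 0 else list(v)


def latent_distance(events_a: Sequence[Event], events_b: Sequence[Event],
                    extractor: Callable[[Tensor], List[float]],
                    **grid) -> float:
    """L2 distance between unit-normalized features of two event streams."""
    fa = unit_normalize(extractor(events_to_tensor(events_a, **grid)))
    fb = unit_normalize(extractor(events_to_tensor(events_b, **grid)))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fa, fb)))
```

In practice, the extractor would be replaced by activations read from the pre-trained network, and a simulated stream would be compared against a real recording of the same scene. Because the features are unit-normalized, the distance in this sketch lies between 0 and 2.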
Abstract
Event cameras promise a paradigm shift in vision sensing through their low latency, high dynamic range, and asynchronous event output. Unfortunately, the scarcity of high-quality labeled datasets hinders their widespread adoption in deep learning-driven computer vision. To mitigate this, several simulators have been proposed to generate synthetic event data for training models on detection and estimation tasks. However, the sensor design of event cameras differs fundamentally from that of traditional frame-based cameras, which makes accurate simulation challenging. As a result, most simulated data fail to mimic data captured by real event cameras. Inspired by existing work on using deep features for image comparison, we introduce the Event Quality Score (EQS), a quality metric that uses activations of the recurrent vision transformer (RVT) architecture. Through sim-to-real experiments on the DSEC driving dataset, we show that a higher EQS corresponds to improved generalization to real-world data after training on simulated events. Optimizing for EQS can therefore guide the development of more realistic event camera simulators and reduce the sim-to-real gap. EQS is available at https://github.com/eventbasedvision/EQS.
