Hybrid Event Frame Sensors: Modeling, Calibration, and Simulation

Yunfan Lu, Nico Messikommer, Xiaogang Xu, Liming Chen, Yuhan Chen, Nikola Zubic, Davide Scaramuzza, Hui Xiong

TL;DR

This work addresses the challenge of accurately modeling and simulating hybrid APS–EVS sensors by introducing a unified, statistics-based noise model that jointly captures photon shot noise, dark current noise, fixed-pattern noise, and quantization noise for both APS frames and EVS events. A calibration pipeline, leveraging per-pixel Quad-Bayer positioning and an APS–EVS domain mapping via a $Q$-function, yields interpretable noise parameters that enable realistic joint noise generation. Building on this, the authors introduce H-ESIM, a physics-grounded simulator that produces RAW frames and events from high-frame-rate video, with an open ISP and data-driven parameter estimation, achieving strong transfer to real data for tasks like video frame interpolation and deblurring. The framework is validated on GEN2 and Eiger hybrid sensors, showing accurate noise representation and improved downstream performance when models are fine-tuned on synthetic data, thereby enabling reproducible research and robust evaluation of hybrid-event vision systems.

Abstract

Event frame hybrid sensors integrate an Active Pixel Sensor (APS) and an Event Vision Sensor (EVS) within a single chip, combining the high dynamic range and low latency of the EVS with the rich spatial intensity information from the APS. While this tight integration offers compact, temporally precise imaging, the complex circuit architecture introduces non-trivial noise patterns that remain poorly understood and unmodeled. In this work, we present the first unified, statistics-based imaging noise model that jointly describes the noise behavior of APS and EVS pixels. Our formulation explicitly incorporates photon shot noise, dark current noise, fixed-pattern noise, and quantization noise, and links EVS noise to illumination level and dark current. Based on this formulation, we further develop a calibration pipeline to estimate noise parameters from real data and offer a detailed analysis of both APS and EVS noise behaviors. Finally, we propose H-ESIM, a statistically grounded simulator that generates RAW frames and events under realistic, jointly calibrated noise statistics. Experiments on two hybrid sensors validate our model across multiple imaging tasks (e.g., video frame interpolation and deblurring), demonstrating strong transfer from simulation to real data.
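To make the four APS noise sources named in the abstract concrete, the following is a minimal single-pixel sketch, not the paper's calibrated model. Everything in it is an assumption made for illustration: the function name `simulate_aps_pixel`, its parameters, unit gain (one digital number per electron), a Gaussian approximation to Poisson shot noise, and a 10-bit output range.

```python
import random


def simulate_aps_pixel(photon_rate, exposure_s, rng,
                       dark_rate=0.0, fixed_offset=0.0, bit_depth=10):
    """Hypothetical APS pixel sketch; names and defaults are illustrative only.

    photon_rate  -- expected photo-electrons per second (assumed unit gain)
    exposure_s   -- exposure time in seconds
    dark_rate    -- expected dark-current electrons per second
    fixed_offset -- per-pixel fixed-pattern offset in digital numbers
    """
    # Photon shot noise: Poisson counts approximated by a Gaussian whose
    # variance equals the expected signal (reasonable for large counts).
    mean_signal = max(photon_rate * exposure_s, 0.0)
    shot = rng.gauss(0.0, mean_signal ** 0.5)

    # Dark current: mean charge grows linearly with exposure and carries
    # its own shot noise, approximated the same way.
    dark_mean = max(dark_rate * exposure_s, 0.0)
    dark = dark_mean + rng.gauss(0.0, dark_mean ** 0.5)

    # Fixed-pattern noise: a constant offset for this pixel.
    value = mean_signal + shot + dark + fixed_offset

    # Quantization: round to an integer code and clip to the ADC range.
    max_code = 2 ** bit_depth - 1
    return min(max(int(round(value)), 0), max_code)


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [simulate_aps_pixel(1e4, 0.01, rng, dark_rate=50.0, fixed_offset=3.0)
               for _ in range(5)]
    print(samples)
```

In this sketch the shot-noise variance scales with the signal while the fixed-pattern term does not, which is why the two can be separated by measuring variance across brightness levels.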

Paper Structure

This paper contains 9 sections, 12 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: (a) Hybrid sensor with interleaved APS and EVS pixels, shown with an example Quad-Bayer layout. (b) Our framework: i. a unified imaging noise model $\mathcal{M}_{\beta}$ for both APS and EVS (Sec. \ref{sec:modeling}); ii. calibration of APS/EVS noise (Sec. \ref{sec:calibration}); iii. the hybrid event sensor simulator, H-ESIM, which generates RAW frames and events (Sec. \ref{sec:simulator}); iv. downstream tasks, such as video frame interpolation and video deblurring (Sec. \ref{sec:downstream_tasks}).
  • Figure 2: Hybrid sensor imaging pipeline and simplified pixel circuits. (a) Real scene. (b) Hybrid event sensor. (c) APS path: photons filtered by the CFA generate charge in the photodiode, which is integrated and read through a shared row bus and column ADC. The noise sources include photon shot noise $N_{\text{shot}}$, dark-current shot noise $N_{\text{DCSN}}$, black-level correction noise $N_{\text{BLC}}$, row noise $N_{\text{row}}$, and quantization noise $N_q$. (d) EVS path: the photodiode drives a logarithmic stage, differentiator, and comparator with threshold $\Theta$ to emit events; the dominant noise arises from photon shot and dark-current processes and from fixed-pattern offsets. The shared photoreceptor ensures tight temporal and spatial alignment, while the mixed layout introduces cross-coupled noise that our model and calibration explicitly address.
  • Figure 3: Key steps of H-ESIM. (I) Input and inverse colorimetric mapping: (a)–(d) map a 3200 fps video to per-pixel intensity $I_c$ and the APS/EVS CFA through inverse gamma, the inverse color matrix, and white balance. (II) APS simulator with calibrated noise: (e)–(g) add the fixed terms $N_{\text{row}}$, $N_{\text{BLC}}$, and $\Delta t N_{\text{DP}}$, then sample the illumination/exposure-dependent variance $f_a(I_c,\Delta t;\bm{\beta}_a)$ to synthesize RAW (Eq. \ref{eq:aps_noise_groups}, Eq. \ref{eq:aps_noise_items}, Eq. \ref{eq:order_2_var}). (III) EVS simulator with statistically modeled triggering: (h)–(k) map $I_c$ to voltage, form the signal $S$ and noise $N_e$ with offsets $\mu_n$, compute $P_+$ and $P_-$ through $Q(\cdot)$ and threshold $\theta$ (Eq. \ref{eq:event_sigma_final_noise}, Eq. \ref{eq:event_estimation_final_noise}), and sample events using $\bm{\beta}_e$.
  • Figure 4: APS noise calibration on GEN2 and Eiger. (a, b) Fixed noise. (c) Measured per-pixel noise variance at $\Delta t = 80\,\text{ms}$. (d, e) Histograms of measured noise and model-predicted noise for the same scene. (f, g) Measured vs. estimated variance across brightness and CFA positions at 80 ms; points denote pixels and dashed curves indicate per-position fits.
  • Figure 5: Noise events on GEN2 and Eiger. (a, b) Event-probability visualization from GEN2 and Eiger. (c) Log-scale histogram of event probability with a linear fit. (d) Event probability vs. brightness. (e, f) Per-pixel positive vs. negative event counts for an illuminated scene and for a dark scene; the diagonal marks equality.
  • ...and 1 more figure
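The EVS triggering step described in the Figure 3 caption (compute $P_+$ and $P_-$ through $Q(\cdot)$ and a threshold, then sample events) can be sketched as follows. This is an illustration under stated assumptions rather than the paper's equations: it assumes the noise $N_e$ is Gaussian with mean `mu_n` and standard deviation `sigma_n`, that a positive event fires when $S + N_e > \theta$ and a negative one when $S + N_e < -\theta$, and that $\theta \ge 0$. The function names and the parameter `sigma_n` are hypothetical.

```python
import math
import random


def q_function(x):
    """Gaussian tail probability Q(x) = P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def event_probabilities(signal, theta, mu_n, sigma_n):
    """Return (P_plus, P_minus) under the assumed Gaussian noise model.

    Assumption: N_e ~ Normal(mu_n, sigma_n**2); a positive event fires when
    signal + N_e > theta, a negative event when signal + N_e < -theta.
    """
    p_plus = q_function((theta - signal - mu_n) / sigma_n)
    p_minus = q_function((theta + signal + mu_n) / sigma_n)
    return p_plus, p_minus


def sample_event(signal, theta, mu_n, sigma_n, rng):
    """Draw +1, -1, or 0 (no event) from the two trigger probabilities."""
    p_plus, p_minus = event_probabilities(signal, theta, mu_n, sigma_n)
    u = rng.random()
    if u < p_plus:
        return 1
    if u < p_plus + p_minus:  # the two outcomes are disjoint for theta >= 0
        return -1
    return 0


if __name__ == "__main__":
    rng = random.Random(0)
    # Zero signal: any events produced here are pure noise events.
    print(event_probabilities(0.0, 0.2, 0.0, 0.1))
    print([sample_event(0.0, 0.2, 0.0, 0.1, rng) for _ in range(10)])
```

With zero signal and zero offset the two probabilities are equal, so any imbalance between positive and negative noise events in this sketch comes from the offset `mu_n`.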