
High Dynamic Range Imaging Based on an Asymmetric Event-SVE Camera System

Pengju Sun, Banglei Guan, Jing Tao, Zhenbao Yu, Xuanyu Bai, Yang Shang, Qifeng Yu

Abstract

High dynamic range (HDR) imaging under extreme illumination remains challenging for conventional cameras due to overexposure. Event cameras provide microsecond temporal resolution and high dynamic range, while spatially varying exposure (SVE) sensors offer single-shot radiometric diversity. We present a hardware-algorithm co-designed HDR imaging system that tightly integrates an SVE micro-attenuation camera with an event sensor in an asymmetric dual-modality configuration. To handle non-coaxial geometry and heterogeneous optics, we develop a two-stage cross-modal alignment framework that combines feature-guided coarse homography estimation with a multi-scale refinement module based on spatial pooling and frequency-domain filtering. On top of the aligned representations, we build a cross-modal HDR reconstruction network with convolutional fusion, mutual-information regularization, and a learnable fusion loss that adaptively balances intensity cues and event-derived structural constraints. Comprehensive experiments on both synthetic benchmarks and real captures demonstrate that the proposed system consistently improves highlight recovery, edge fidelity, and robustness compared with frame-only or event-only HDR pipelines. The results indicate that jointly optimizing optical design, cross-modal alignment, and computational fusion provides an effective foundation for reliable HDR perception in highly dynamic and radiometrically challenging environments.

Paper Structure (19 sections, 25 equations, 6 figures, 3 tables)

This paper contains 19 sections, 25 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Overview of event-assisted HDR reconstruction. Top: a conventional event+frame HDR pipeline typically fuses an intensity frame with an event representation using a fixed fusion loss. Bottom: our asymmetric Event-SVE HDR system encodes multi-exposure radiometric measurements and, to compensate for the non-coaxial setup, performs calibration-guided coarse alignment followed by learnable refinement.
  • Figure 2: Hybrid Event-SVE imaging platform. (a) Acquisition architecture in which a programmable trigger generator synchronizes the SVE camera’s exposure cycles with the event sensor’s asynchronous readout. (b) Temporal sampling characteristics showing discrete multi-exposure frame capture versus continuous event generation. (c) Hardware prototype with independent optical paths and a custom synchronization controller that provides a shared trigger reference for event-frame pairing.
  • Figure 3: Overview of the proposed cross-modal alignment and HDR fusion network. Multi-exposure SVE frames and event streams are encoded into multi-scale feature pyramids. At each pyramid level, the features are refined through spatial pooling and frequency-domain convolution to achieve cross-modal alignment. The aligned features are then aggregated and passed through an encoder–decoder sub-network to reconstruct the final HDR image.
  • Figure 4: Structure diagram of the FDConv module. FDConv transforms features to the frequency domain, applies a learnable spectral response via element-wise complex multiplication, and transforms the result back to the spatial domain.
  • Figure 5: Qualitative comparison on the SVE-HDR dataset. Zoom in for better visualization of details. Our results recover more detail and show better visual quality than those of other methods.
  • ...and 1 more figures
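
The frequency-domain operation described in the Figure 4 caption (transform, element-wise complex multiplication, inverse transform) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `fdconv`, the `(C, H, W)` feature layout, the use of a real-input FFT, and the fixed (rather than learned) `spectral_response` array are all assumptions made for illustration.

```python
import numpy as np


def fdconv(features, spectral_response):
    """Filter a real feature map in the frequency domain.

    features:          real array of shape (C, H, W)            (assumed layout)
    spectral_response: complex array of shape (C, H, W // 2 + 1);
                       stands in for the learnable weights       (hypothetical)
    """
    # 1) Transform each channel to the frequency domain (real-input 2-D FFT).
    spectrum = np.fft.rfft2(features, axes=(-2, -1))
    # 2) Apply the spectral response by element-wise complex multiplication.
    filtered = spectrum * spectral_response
    # 3) Transform back to the spatial domain at the original size.
    return np.fft.irfft2(filtered, s=features.shape[-2:], axes=(-2, -1))


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))

# An all-ones response leaves the features unchanged (up to rounding error).
identity = np.ones((4, 8, 5), dtype=np.complex128)
y = fdconv(x, identity)

# A response that keeps only the DC bin reduces each channel to its mean.
dc_only = np.zeros((4, 8, 5), dtype=np.complex128)
dc_only[:, 0, 0] = 1.0
z = fdconv(x, dc_only)
```

Element-wise multiplication in the frequency domain corresponds to a circular convolution in the spatial domain, so a single spectral response gives every output pixel a receptive field covering the whole feature map. In a trainable version, `spectral_response` would be a parameter updated by gradient descent rather than a fixed array.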