iHDR: Iterative HDR Imaging with Arbitrary Number of Exposures

Yu Yuan, Yiheng Chi, Xingguang Zhang, Stanley Chan

TL;DR

Most learning-based HDR methods are built for a fixed number of input exposures. This work introduces iHDR, an iterative HDR fusion framework that handles an arbitrary number of LDR inputs without retraining. It combines a ghost-free dual-input fusion network (DiHDR) with a physics-based domain mapper (ToneNet) and a semi-cross attention transformer (SCAT) that leverages side information such as pseudo-HDR images, structure tensors, and difference masks. Results on standard HDR benchmarks and a newly collected 9-input dataset show that iHDR suppresses ghosting and recovers detail better than state-of-the-art HDR deghosting and tonemapping baselines, and that it remains robust as the number of inputs increases. The approach offers a scalable, flexible HDR solution for dynamic scenes.

Abstract

High dynamic range (HDR) imaging aims to obtain a high-quality HDR image by fusing information from multiple low dynamic range (LDR) images. Numerous learning-based HDR imaging methods have been proposed to achieve this for static and dynamic scenes. However, their architectures are mostly tailored to a fixed number (e.g., three) of inputs and therefore cannot be applied directly to situations outside that pre-defined scope. To address this issue, we propose a novel framework, iHDR, for iterative fusion, which comprises a ghost-free Dual-input HDR fusion network (DiHDR) and a physics-based domain mapping network (ToneNet). DiHDR uses a pair of inputs to estimate an intermediate HDR image, while ToneNet maps that image back to the nonlinear domain, where it serves as the reference input for the next pairwise fusion. This process is executed iteratively until all input frames have been used. Qualitative and quantitative experiments demonstrate the effectiveness of the proposed method compared with existing state-of-the-art HDR deghosting approaches for flexible numbers of input frames.
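
The iterative procedure described in the abstract can be sketched as a simple loop. The following is a minimal illustration, not the authors' implementation: `fuse_pair` and `to_nonlinear` are hypothetical stand-ins for DiHDR and ToneNet, the function name `iterative_fusion` is invented here, and the frame ordering (first frame as the initial reference, remaining frames consumed in the order given) is an assumption. To stay dependency-free, the toy stand-ins treat an "image" as a single float in [0, 1] and use a plain gamma curve in place of the learned networks.

```python
from typing import Callable, Sequence, TypeVar

Image = TypeVar("Image")


def iterative_fusion(
    frames: Sequence[Image],
    fuse_pair: Callable[[Image, Image], Image],
    to_nonlinear: Callable[[Image], Image],
) -> Image:
    """Fuse N >= 2 LDR frames by repeated pairwise fusion.

    fuse_pair(reference, non_reference) plays the role of DiHDR and returns
    a linear-domain HDR estimate. to_nonlinear(hdr) plays the role of ToneNet
    and maps that estimate back to the domain of the LDR inputs.
    """
    if len(frames) < 2:
        raise ValueError("need at least two LDR frames")

    reference = frames[0]  # assumption: the first frame is the initial reference
    hdr = None
    last = len(frames) - 1
    for i in range(1, len(frames)):
        # Pairwise fusion of the current reference with the next unused frame.
        hdr = fuse_pair(reference, frames[i])
        if i < last:
            # Map the intermediate HDR back to the nonlinear domain so it can
            # act as the reference input of the next pairwise fusion.
            reference = to_nonlinear(hdr)
    return hdr


# Toy stand-ins (assumptions for illustration only, not the learned networks).
GAMMA = 2.2


def toy_fuse(ref: float, non_ref: float) -> float:
    # Linearize both inputs with a gamma curve and average them.
    return (ref ** GAMMA + non_ref ** GAMMA) / 2.0


def toy_tone(hdr: float) -> float:
    # Inverse gamma: back to the nonlinear domain of the inputs.
    return hdr ** (1.0 / GAMMA)


result = iterative_fusion([0.2, 0.5, 0.8, 0.9], toy_fuse, toy_tone)
```

With N input frames, this loop calls the fusion step N − 1 times and the domain-mapping step N − 2 times, because the final fusion output is returned in the linear domain without being mapped back.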

Paper Structure

This paper contains 20 sections, 8 equations, 12 figures, 4 tables.

Figures (12)

  • Figure 1: The proposed iHDR framework. ToneNet maps the linear HDR obtained from DiHDR back to the nonlinear domain consistent with the inputs.
  • Figure 2: Three types of side information derived from the inputs are used to enhance the learning capability of the network and suppress artifacts: pseudo-HDR images of the inputs ($\mathbf{H}_\text{r}$ and $\mathbf{H}_\text{nr}$), the structure tensor of the reference frame ($\mathbf{S}$), and the difference mask between the two inputs ($\mathbf{D}$).
  • Figure 3: The reversed flat map of structure tensor has clearer textures and is more robust to noise.
  • Figure 4: The overall framework of (a) DiHDR and (b) ToneNet. (c) Feature Encoder transforms the inputs and their side information into features using SCAT. (d) Multi-scale structure tensor priors are injected into the network to guide the structure of the generated image persistently. (e) In SCAT, additional prior features from structure tensor or difference mask are introduced into the transformer to capture cross attention. For SCAT blocks without prior inputs, the red dashed flows are masked.
  • Figure 5: Tonemapping comparisons.
  • ...and 7 more figures