GMODiff: One-Step Gain Map Refinement with Diffusion Priors for HDR Reconstruction
Tao Hu, Weiyu Zhou, Yanjie Tu, Peng Wu, Wei Dong, Qingsen Yan, Yanning Zhang
TL;DR
GMODiff reframes multi-exposure HDR reconstruction as conditional gain-map refinement, leveraging a degradation-aware regressor to initialize a one-step diffusion refinement guided by priors from LDR inputs. The two-stage approach uses DaReg to produce an initial gain map and reliability cues, then fine-tunes a latent diffusion model with a degradation-aware decoder to suppress artifacts while preserving structure. This yields superior perceptual and no-reference quality with significantly reduced inference time compared to prior diffusion-based HDR methods. The method demonstrates strong performance on real-world datasets and offers practical, fast HDR reconstruction suitable for real-time applications.
Abstract
Pre-trained Latent Diffusion Models (LDMs) have recently shown strong perceptual priors for low-level vision tasks, making them a promising direction for multi-exposure High Dynamic Range (HDR) reconstruction. However, directly applying LDMs to HDR remains challenging due to: (1) limited dynamic-range representation caused by 8-bit latent compression, (2) high inference cost from multi-step denoising, and (3) content hallucination inherent to their generative nature. To address these challenges, we introduce GMODiff, a gain map-driven one-step diffusion framework for multi-exposure HDR reconstruction. Instead of reconstructing full HDR content, we reformulate HDR reconstruction as a conditionally guided Gain Map (GM) estimation task, where the GM encodes the extended dynamic range while retaining the same bit depth as LDR images. We initialize the denoising process from an informative regression-based estimate rather than pure noise, enabling the model to generate high-quality GMs in a single denoising step. Furthermore, recognizing that regression-based models excel in content fidelity while LDMs favor perceptual quality, we leverage regression priors to guide both the denoising process and the latent decoding of the LDM, suppressing hallucinations while preserving structural accuracy. Extensive experiments demonstrate that GMODiff performs favorably against several state-of-the-art methods and is 100× faster than previous LDM-based methods.
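
To make the gain-map idea concrete, the following minimal sketch shows how an 8-bit map can carry extended dynamic range on top of an LDR image. It is an illustration under stated assumptions, not the parametrization used by GMODiff: the log2-ratio encoding, the function names `encode_gain_map` and `apply_gain_map`, and the parameters `max_log2_gain` and `eps` are all hypothetical choices made for this example.

```python
import numpy as np

# Assumption: the gain map stores a per-pixel log2 ratio between HDR and LDR,
# normalized to [0, 1] and quantized to 8 bits. The abstract does not specify
# the encoding; this is one common convention, used here only for illustration.

def encode_gain_map(hdr, ldr, max_log2_gain=4.0, eps=1e-6):
    """Quantize the per-pixel log2(HDR / LDR) ratio to an 8-bit map."""
    log_gain = np.log2((hdr + eps) / (ldr + eps))
    norm = np.clip(log_gain / max_log2_gain, 0.0, 1.0)
    return np.round(norm * 255.0).astype(np.uint8)

def apply_gain_map(ldr, gm, max_log2_gain=4.0, eps=1e-6):
    """Recover a linear HDR estimate from an LDR image and its 8-bit gain map."""
    log_gain = gm.astype(np.float64) / 255.0 * max_log2_gain
    return (ldr + eps) * np.exp2(log_gain) - eps

# Synthetic example: an LDR image and an HDR target up to 16x brighter.
rng = np.random.default_rng(0)
ldr = rng.uniform(0.05, 1.0, size=(8, 8, 3))
hdr = ldr * np.exp2(rng.uniform(0.0, 4.0, size=(8, 8, 3)))

gm = encode_gain_map(hdr, ldr)      # same bit depth as an 8-bit LDR image
hdr_rec = apply_gain_map(ldr, gm)   # extended range recovered from LDR + GM
max_rel_err = np.max(np.abs(hdr_rec - hdr) / hdr)
```

In this toy setting the only loss comes from quantizing the log-ratio, which suggests why estimating a gain map can sidestep the dynamic-range limit of an 8-bit latent representation: the map itself stays within LDR bit depth while the product with the LDR input spans the HDR range.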
