JoReS-Diff: Joint Retinex and Semantic Priors in Diffusion Model for Low-light Image Enhancement

Yuhui Wu; Guoqing Wang; Zhiwen Wang; Yang Yang; Tianyu Li; Malu Zhang; Chongyi Li; Heng Tao Shen

TL;DR

JoReS-Diff tackles low-light image enhancement (LLIE) by conditioning a diffusion model on physical Retinex priors and semantic guidance. It introduces Retinex-based condition learning and a Retinex-conditioned refinement network (RNet) to preserve color and detail, and incorporates semantic priors through semantic attention layers to maintain structure and semantics. The method reports improvements over state-of-the-art LLIE methods and diffusion baselines across multiple datasets, with ablations validating the contribution of each component. This joint conditioning framework offers a practical route to robust LLIE and related image enhancement tasks with diffusion models.

Abstract

Low-light image enhancement (LLIE) has achieved promising performance by employing conditional diffusion models. Despite the success of some conditional methods, previous methods may neglect the importance of a sufficient formulation of task-specific condition strategy, resulting in suboptimal visual outcomes. In this study, we propose JoReS-Diff, a novel approach that incorporates Retinex- and semantic-based priors as the additional pre-processing condition to regulate the generating capabilities of the diffusion model. We first leverage a pre-trained decomposition network to generate the Retinex prior, which is updated with better quality by an adjustment network and integrated into a refinement network to implement Retinex-based conditional generation at both the feature and image levels. Moreover, the semantic prior is extracted from the input image with an off-the-shelf semantic segmentation model and incorporated through semantic attention layers. By treating Retinex- and semantic-based priors as the condition, JoReS-Diff presents a unique perspective for establishing a diffusion model for LLIE and similar image enhancement tasks. Extensive experiments validate the rationality and superiority of our approach.
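The Retinex prior mentioned in the abstract rests on the classical Retinex assumption that an image is the element-wise product of a reflectance map and an illumination map. The sketch below illustrates only that assumption. It is not the pre-trained decomposition network or the adjustment network from the paper: the max-over-channels illumination estimate, the gamma adjustment, and the names `naive_retinex_decompose` and `adjust_illumination` are hypothetical stand-ins chosen for illustration.

```python
import numpy as np


def naive_retinex_decompose(image, eps=1e-4):
    """Split an RGB image in [0, 1] into reflectance and illumination.

    Assumption for illustration: illumination is the per-pixel maximum over
    the color channels. The paper uses a learned decomposition network.
    """
    illumination = np.max(image, axis=-1, keepdims=True)
    illumination = np.clip(illumination, eps, 1.0)  # avoid division by zero
    reflectance = np.clip(image / illumination, 0.0, 1.0)
    return reflectance, illumination


def adjust_illumination(illumination, gamma=0.4):
    """Brighten the illumination map with a gamma curve (hypothetical stand-in
    for a learned adjustment step)."""
    return illumination ** gamma


# A synthetic dark image with values in [0, 0.2].
rng = np.random.default_rng(0)
low = rng.uniform(0.0, 0.2, size=(4, 4, 3))

R, L = naive_retinex_decompose(low)
recon = R * L                          # Retinex model: image = reflectance * illumination
enhanced = R * adjust_illumination(L)  # same reflectance, brighter illumination
```

In this toy version the reflectance carries the color ratios and the illumination carries the brightness, which gives some intuition for why a Retinex-based condition could help a generative model keep colors stable while it brightens the image.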

Paper Structure (14 sections, 19 equations, 8 figures, 6 tables)

Introduction
Related Work
Method
Conditional Denoising Diffusion Model
Retinex Prior Incorporation
Retinex-based Condition Learning
Retinex-conditioned Refinement
Semantic Prior Incorporation
Experiments
Experimental Settings
Quantitative Evaluation
Qualitative Evaluation
Ablation Study
Conclusion

Figures (8)

Figure 1: Visual comparisons among the recent DiffLL [jiang2023waveletdifflow], PyDiff [zhou2023pyramiddiffusionllie], and our JoReS-Diff on the LOL-v2 dataset. Previous diffusion-based methods exhibit detail loss and color distortion. Our method properly maintains color constancy and generates realistic textures thanks to the introduction of the superior Retinex and semantic priors.
Figure 2: Comparison between Diff-Retinex [yi2023diffretinex] and ours. Diff-Retinex still relies on the original decomposition and multiplication process and uses two diffusion models to process the decomposed maps, which are multiplied to form the output. Our method is end-to-end and uses both Retinex and semantic priors, which are integrated into one single diffusion model. We also propose RNet to fully exploit the Retinex prior.
Figure 3: Overview of our Retinex- and semantic-based conditional diffusion model (JoReS-Diff). (a) The introduction of the Retinex prior contains two stages. In the learning stage, DNet provides the initial decomposed maps and ANet outputs reliable Retinex-based conditions; in the refinement stage, the conditions ${R_t}\!', {L_t}\!', F_t$ are incorporated into UNet and RNet through Retinex attention layers and F/IRCMs (detailed in Figure 4) to preserve the color and content. (b) The semantic prior $\mathbf{c}_{seg}$ is extracted by a pre-trained segmentation model and then integrated into UNet through semantic attention layers.
Figure 4: The architecture of the feature/image-level Retinex-conditioned modules (F/IRCM). FRCM inputs multi-scale features $F_t$ from ANet and obtains the optimized image feature $F_{\hat{\mathbf{x}}_0}^{\: '}$. Then, IRCM inputs ${R_t}\!'$, ${L_t}\!'$ and $F_{\hat{\mathbf{x}}_{0}}^{\: '}$ to refine the approximated $\hat{\mathbf{x}}_0$ and produce the final output ${\hat{\mathbf{x}}_0}'$.
Figure 5: Visual comparison of our JoReS-Diff and the compared LLIE methods on the LOL dataset.
...and 3 more figures
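
The caption of Figure 3(b) states that the semantic prior is integrated into UNet through semantic attention layers, but this page does not give the layer design. As a rough intuition only, the sketch below assumes a generic cross-attention form in which image features act as queries and semantic features act as keys and values. The function `semantic_cross_attention`, the projection matrices, and all shapes are hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def semantic_cross_attention(feat, sem, w_q, w_k, w_v):
    """Generic cross-attention (assumed form, not the paper's exact layer).

    feat: (N, c)   flattened image features, used as queries
    sem:  (M, c_s) flattened semantic-prior features, used as keys and values
    Returns the residual-updated features (N, c) and the attention map (N, M).
    """
    q = feat @ w_q                                   # (N, d)
    k = sem @ w_k                                    # (M, d)
    v = sem @ w_v                                    # (M, c)
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))   # each row sums to 1
    return feat + attn @ v, attn


# Toy shapes: 16 image tokens, 8 semantic tokens.
rng = np.random.default_rng(0)
N, M, c, c_s, d = 16, 8, 32, 20, 24
feat = rng.normal(size=(N, c))
sem = rng.normal(size=(M, c_s))
w_q = rng.normal(size=(c, d)) * 0.1
w_k = rng.normal(size=(c_s, d)) * 0.1
w_v = rng.normal(size=(c_s, c)) * 0.1

out, attn = semantic_cross_attention(feat, sem, w_q, w_k, w_v)
```

The residual connection keeps the original image features and adds a semantics-weighted correction, which is one common way to inject a conditioning signal without overwriting the backbone features.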
