JoReS-Diff: Joint Retinex and Semantic Priors in Diffusion Model for Low-light Image Enhancement

Yuhui Wu, Guoqing Wang, Zhiwen Wang, Yang Yang, Tianyu Li, Malu Zhang, Chongyi Li, Heng Tao Shen

TL;DR

JoReS-Diff tackles low-light image enhancement (LLIE) by conditioning a diffusion model on both physical Retinex priors and semantic guidance. It introduces Retinex-based condition learning and a Retinex-conditioned refinement network (RNet) to preserve color and detail, and it injects semantic priors through semantic attention layers to maintain structure and semantics. The method reports improvements over state-of-the-art LLIE methods and diffusion baselines across multiple datasets, with ablations supporting the contribution of each component. This joint conditioning framework offers a practical route to robust LLIE and related image enhancement tasks with diffusion models.

Abstract

Low-light image enhancement (LLIE) has achieved promising performance by employing conditional diffusion models. Despite the success of some conditional methods, previous methods may neglect the importance of a sufficiently formulated task-specific conditioning strategy, resulting in suboptimal visual outcomes. In this study, we propose JoReS-Diff, a novel approach that incorporates Retinex- and semantic-based priors as additional pre-processing conditions to regulate the generative capabilities of the diffusion model. We first leverage a pre-trained decomposition network to generate the Retinex prior, which is then improved by an adjustment network and integrated into a refinement network to implement Retinex-based conditional generation at both the feature and image levels. Moreover, the semantic prior is extracted from the input image with an off-the-shelf semantic segmentation model and incorporated through semantic attention layers. By treating Retinex- and semantic-based priors as the condition, JoReS-Diff presents a unique perspective for establishing a diffusion model for LLIE and similar image enhancement tasks. Extensive experiments validate the rationality and superiority of our approach.
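
The Retinex prior mentioned in the abstract rests on the classical Retinex model, which treats an image as the element-wise product of a reflectance map and an illumination map. The sketch below illustrates that model only. It is not the paper's method: a hand-crafted max-channel illumination estimate and a gamma curve stand in for the learned decomposition and adjustment networks, and the names `retinex_decompose` and `adjust_illumination` are hypothetical.

```python
import numpy as np


def retinex_decompose(image, eps=1e-4):
    """Split an RGB image in [0, 1] into reflectance R and illumination L, with image ~= R * L.

    Assumption: illumination is estimated as the per-pixel maximum over the
    colour channels, a classical heuristic rather than a learned network.
    """
    illumination = image.max(axis=2, keepdims=True)        # shape (H, W, 1)
    reflectance = image / np.maximum(illumination, eps)    # shape (H, W, 3)
    return reflectance, illumination


def adjust_illumination(illumination, gamma=0.4):
    """Brighten the illumination map with a gamma curve (gamma < 1 lifts dark values)."""
    return np.power(illumination, gamma)


# Synthetic low-light image: every pixel value lies in [0.01, 0.2).
rng = np.random.default_rng(0)
low = rng.uniform(0.01, 0.2, size=(4, 4, 3))

R, L = retinex_decompose(low)
# Recombine the reflectance with the brightened illumination.
enhanced = np.clip(R * adjust_illumination(L), 0.0, 1.0)
```

In this toy setting the reflectance carries the colour ratios and the illumination carries the brightness, so changing only the illumination brightens the image while leaving the colour ratios unchanged.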

Paper Structure (14 sections, 19 equations, 8 figures, 6 tables)

This paper contains 14 sections, 19 equations, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Visual comparisons among the recent DiffLL (Jiang et al., 2023), PyDiff (Zhou et al., 2023), and our JoReS-Diff on the LOL-v2 dataset. Previous diffusion-based methods exhibit detail loss and color distortion. Our method maintains color constancy and generates realistic textures thanks to the introduction of the Retinex and semantic priors.
  • Figure 2: Comparison between Diff-Retinex (Yi et al., 2023) and ours. Diff-Retinex still relies on the original decomposition-and-multiplication process: it uses two diffusion models to process the decomposed maps, which are then multiplied to form the output. Our method is end-to-end and integrates both Retinex and semantic priors into a single diffusion model. We also propose RNet to fully exploit the Retinex prior.
  • Figure 3: Overview of our Retinex- and semantic-based conditional diffusion model (JoReS-Diff). (a) The Retinex prior is introduced in two stages. In the learning stage, DNet provides the initial decomposed maps and ANet outputs reliable Retinex-based conditions. In the refinement stage, the conditions ${R_t}\!', {L_t}\!', F_t$ are incorporated into UNet and RNet through Retinex attention layers and F/IRCMs (detailed in Figure 4) to preserve the color and content. (b) The semantic prior $\mathbf{c}_{seg}$ is extracted by a pre-trained segmentation model and then integrated into UNet through semantic attention layers.
  • Figure 4: The architecture of the feature/image-level Retinex-conditioned modules (F/IRCM). FRCM takes the multi-scale features $F_t$ from ANet as input and produces the optimized image feature $F_{\hat{\mathbf{x}}_0}^{\: '}$. Then, IRCM takes ${R_t}\!'$, ${L_t}\!'$, and $F_{\hat{\mathbf{x}}_{0}}^{\: '}$ as input to refine the approximated $\hat{\mathbf{x}}_0$ and produce the final output ${\hat{\mathbf{x}}_0}'$.
  • Figure 5: Visual comparison of our JoReS-Diff and the compared LLIE methods on the LOL dataset.
  • ...and 3 more figures
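
The Figure 3 caption says the semantic prior is integrated into UNet through semantic attention layers, but the captions here do not specify the layer's exact form. One plausible reading is a cross-attention layer in which image features supply the queries and the semantic prior supplies the keys and values. The sketch below follows that assumption. The function `semantic_cross_attention`, the projection matrices, the residual connection, and all dimensions are hypothetical and chosen for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def semantic_cross_attention(image_feats, semantic_feats, w_q, w_k, w_v):
    """Cross-attention from image tokens to semantic tokens (assumed form).

    image_feats:    (N, C)  flattened image features (queries)
    semantic_feats: (M, Cs) flattened semantic prior (keys and values)
    Returns the updated image features (N, C) and the attention weights (N, M).
    """
    q = image_feats @ w_q                       # (N, D)
    k = semantic_feats @ w_k                    # (M, D)
    v = semantic_feats @ w_v                    # (M, C)
    scores = q @ k.T / np.sqrt(q.shape[-1])     # scaled dot-product, (N, M)
    weights = softmax(scores, axis=-1)          # each row sums to 1
    # Residual connection (an assumption): semantic context is added to the image features.
    return image_feats + weights @ v, weights


rng = np.random.default_rng(0)
N, M, C, Cs, D = 16, 16, 8, 6, 4                # toy sizes, not taken from the paper
image_feats = rng.normal(size=(N, C))
semantic_feats = rng.normal(size=(M, Cs))
w_q = rng.normal(size=(C, D))
w_k = rng.normal(size=(Cs, D))
w_v = rng.normal(size=(Cs, C))

out, weights = semantic_cross_attention(image_feats, semantic_feats, w_q, w_k, w_v)
```

The output keeps the shape of the image features, so a layer of this kind could in principle be placed between existing UNet blocks without changing their interfaces.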