ProDehaze: Prompting Diffusion Models Toward Faithful Image Dehazing

Tianwen Zhou, Jing Wang, Songtao Wu, Kuanhong Xu

TL;DR

ProDehaze tackles hallucination in diffusion-model-based image dehazing by injecting selective internal priors to guide external pretrained priors. It introduces two components: Structure-Prompted Restorer (SPR) in the latent space, which uses high-frequency structure cues via Haar DWT to emphasize structure-rich regions, and Haze-Aware Self-Correcting Refiner (HCR) in decoding, which employs a Dark Channel Prior-derived haze mask to modulate attention and promote distribution alignment between clearer input regions and the output. The approach demonstrates superior fidelity and reduced color shifts on synthetic and real-world hazy datasets, outperforming several diffusion-based and prompting methods without extensive real-world fine-tuning. Overall, ProDehaze validates internal priors as a powerful mechanism to guide pretrained diffusion models toward faithful restoration, with potential applicability to a broader class of inverse problems in vision.

Abstract

Recent approaches using large-scale pretrained diffusion models for image dehazing improve perceptual quality but often suffer from hallucination, producing dehazed images that are unfaithful to the original input. To mitigate this, we propose ProDehaze, a framework that employs internal image priors to direct the external priors encoded in pretrained models. We introduce two types of selective internal priors that prompt the model to concentrate on critical image areas: a Structure-Prompted Restorer in the latent space that emphasizes structure-rich regions, and a Haze-Aware Self-Correcting Refiner in the decoding process that aligns distributions between clearer input regions and the output. Extensive experiments on real-world datasets demonstrate that ProDehaze achieves high-fidelity results in image dehazing, particularly in reducing color shifts. Our code is at https://github.com/TianwenZhou/ProDehaze.
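
To make the "high-frequency structure cues via Haar DWT" idea from the TL;DR concrete, the following is a minimal sketch of a single-level 2D Haar transform that pools the three detail sub-bands into one structure map. It is not the authors' Haar Feature Extractor: the function names `haar_dwt2` and `high_freq_map`, the orthonormal scaling, the sub-band labels, and the choice to combine the detail bands by their Euclidean magnitude are all assumptions made for illustration, and plain Python lists are used only to keep the example dependency-free.

```python
import math


def haar_dwt2(img):
    """Single-level 2D Haar DWT of a grayscale image.

    `img` is a list of rows with even height and width.
    Returns four half-resolution sub-bands (ll, lh, hl, hh).
    Sub-band naming conventions vary; here lh holds row differences,
    hl holds column differences, and hh holds diagonal differences.
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, h, 2):
        r_ll, r_lh, r_hl, r_hh = [], [], [], []
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            r_ll.append((a + b + c + d) / 2)  # local average (low frequency)
            r_lh.append((a + b - c - d) / 2)  # top row minus bottom row
            r_hl.append((a - b + c - d) / 2)  # left column minus right column
            r_hh.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(r_ll)
        lh.append(r_lh)
        hl.append(r_hl)
        hh.append(r_hh)
    return ll, lh, hl, hh


def high_freq_map(img):
    """Hypothetical structure cue: magnitude of the three detail sub-bands."""
    _, lh, hl, hh = haar_dwt2(img)
    return [
        [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(r1, r2, r3)]
        for r1, r2, r3 in zip(lh, hl, hh)
    ]


if __name__ == "__main__":
    flat = [[0.5] * 4 for _ in range(4)]           # no structure
    stripes = [[0.0, 1.0, 0.0, 1.0] for _ in range(2)]  # vertical stripes
    print(high_freq_map(flat))
    print(high_freq_map(stripes))
```

A flat region yields a zero response while a striped region yields a strong one, which is the property a structural prompt would rely on to single out structure-rich areas.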

Paper Structure

This paper contains 15 sections, 9 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Framework of ProDehaze. We employ a two-phase finetuning strategy for faithful dehazing. In the first phase, we train the Structure-Prompted Restorer (SPR) in the latent space using a structural prompt generated by a Haar Feature Extractor (HFE) from the hazy input $x_{in}$. This prompt is concatenated with the latent representation of $x_{in}$ and injected into the trainable adapter $\mathcal{N}$ to provide structural guidance. In the second phase, we finetune the Haze-Aware Self-Correcting Refiner (HCR) in the decoding process. The haze-aware prompt, initialized by the Dark Channel Prior (DCP), produces a sparse mask $M_s$ that emphasizes the clearer areas in $x_{in}$. The mask is used to modulate the attention map of the window swin transformer (WST) in the decoder $\mathcal{D}$. Finally, $\mathcal{D}$ and the refine network are jointly trained for better alignment between the clearer regions in $x_{in}$ and the output $x_{r}$.
  • Figure 2: Qualitative comparison of dehazing results with different methods. From top to bottom are the results on I-Haze, O-Haze, DenseHaze and NhHaze.
  • Figure 3: Ablation study on the proposed method. (c) Finetuning only the vanilla adapter $\mathcal{N}$; (d) finetuning SPR only; (e) finetuning both SPR and HCR, but without $M_s$ modulation; (f) full setting.
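
The Figure 1 caption describes a sparse mask $M_s$, initialized from the Dark Channel Prior, that emphasizes the clearer areas of the hazy input. As a rough illustration of how such a mask could be initialized, the sketch below computes a dark channel (per-pixel minimum over color channels, followed by a local minimum filter) and marks pixels with a low dark-channel value as clearer. The function names `dark_channel` and `clear_region_mask`, the `threshold` and `radius` parameters, and the hard thresholding step are assumptions for illustration; the source does not specify how $M_s$ is derived or made sparse.

```python
def dark_channel(img, radius=1):
    """Dark channel of an RGB image given as H x W x 3 nested lists in [0, 1].

    Each output value is the minimum intensity over all channels within a
    (2 * radius + 1)-wide square window, clipped at the image border.
    """
    h, w = len(img), len(img[0])
    # Per-pixel minimum over the color channels.
    min_rgb = [[min(px) for px in row] for row in img]
    dark = []
    for i in range(h):
        row = []
        for j in range(w):
            i0, i1 = max(0, i - radius), min(h, i + radius + 1)
            j0, j1 = max(0, j - radius), min(w, j + radius + 1)
            # Local minimum filter over the window.
            row.append(
                min(min_rgb[y][x] for y in range(i0, i1) for x in range(j0, j1))
            )
        dark.append(row)
    return dark


def clear_region_mask(img, threshold=0.3, radius=1):
    """Hypothetical binary mask: 1 where the dark channel is low (less haze)."""
    dark = dark_channel(img, radius)
    return [[1 if v < threshold else 0 for v in row] for row in dark]


if __name__ == "__main__":
    clear_px = (0.9, 0.1, 0.2)  # saturated color: at least one dark channel
    hazy_px = (0.8, 0.8, 0.8)   # washed-out gray: all channels bright
    img = [[clear_px, clear_px, hazy_px, hazy_px] for _ in range(2)]
    print(clear_region_mask(img, radius=0))
    print(clear_region_mask(img, radius=1))
```

Haze tends to raise the minimum channel intensity, so low dark-channel values are a common heuristic for less hazy regions. A larger `radius` lets dark pixels spread into neighboring hazy ones, so the window size trades spatial precision against robustness to noise.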