DSPFusion: Image Fusion via Degradation and Semantic Dual-Prior Guidance

Linfeng Tang, Chunyu Li, Guoqing Wang, Yixuan Yuan, Jiayi Ma

TL;DR

This work addresses infrared-visible image fusion (IVIF) on degraded inputs by introducing DSPFusion, a dual-prior framework that leverages modality-specific degradation priors and a jointly derived semantic prior restored via a latent-space diffusion model. Stage I extracts the priors and performs initial restoration and fusion with a Transformer-based network guided by both priors, while Stage II uses a semantic-prior diffusion model to recover high-quality priors from degraded inputs, enabling fast, high-quality fusion. The approach unifies degradation suppression and information aggregation, supported by a contrastive loss for the degradation priors and multiple content, structure, and color losses, and it achieves strong performance across standard and degraded datasets with significantly less computation than image-domain diffusion methods. The results show that DSPFusion outperforms state-of-the-art methods on multiple metrics, improves downstream tasks such as object detection, and maintains practical efficiency, broadening the applicability of robust IVIF in real-world scenarios.

Abstract

Existing fusion methods are tailored for high-quality images but struggle with degraded images captured under harsh circumstances, thus limiting the practical potential of image fusion. This work presents a Degradation and Semantic Prior dual-guided framework for degraded image Fusion (DSPFusion), utilizing degradation priors and high-quality scene semantic priors restored via diffusion models to guide both information recovery and fusion in a unified model. Specifically, it first extracts modality-specific degradation priors individually, while jointly capturing comprehensive low-quality semantic priors. Subsequently, a diffusion model is developed to iteratively restore high-quality semantic priors in a compact latent space, enabling our method to be over $20\times$ faster than mainstream diffusion model-based image fusion schemes. Finally, the degradation priors and high-quality semantic priors are employed to guide information enhancement and aggregation via the dual-prior guidance and prior-guided fusion modules. Extensive experiments demonstrate that DSPFusion mitigates most typical degradations while integrating complementary context with minimal computational cost, greatly broadening the application scope of image fusion.

Paper Structure

This paper contains 18 sections, 17 equations, 12 figures, 6 tables.

Figures (12)

  • Figure 1: The overall framework of our image fusion network with degradation and semantic dual-prior guidance.
  • Figure 2: Visualization of fusion results in different degraded scenarios with pre-enhancement.
  • Figure 3: t-SNE plots of the degradation embeddings.
  • Figure 4: Visual comparison of object detection.
  • Figure 5: Evaluation results with DepictQA.
  • ...and 7 more figures