Infrared-Assisted Single-Stage Framework for Joint Restoration and Fusion of Visible and Infrared Images under Hazy Conditions

Huafeng Li, Jiaqi Fang, Yafei Zhang, Yu Liu

TL;DR

This work tackles haze-affected infrared-visible image fusion by introducing a single-stage framework that jointly restores and fuses IR-VIS data. It integrates an infrared-assisted feature restoration module (IA-FRM), a prompt generation module (PGM), and a multi-stage prompt embedding fusion module (MsPE-FM) to bridge modality gaps and enhance dehazing. The method employs haze-density driven feature augmentation, Restormer-based encoding, and a learnable prompt pool to produce adaptive, content-aware prompts that guide fusion across multiple stages, optimized with a combined loss including $\ell_{int}$, $\ell_\nabla$, and $\alpha \ell_1$. Experimental results on MSRS, M$^3$FD, and RoadScene demonstrate superior or competitive performance with lower complexity compared to two-stage pipelines and several state-of-the-art fusion methods, highlighting practical potential for hazy scene analysis and surveillance applications.
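As a rough illustration of how a combined objective of this kind can be assembled, the sketch below computes an intensity term, a gradient term, and a weighted $\ell_1$ term on tiny grayscale images stored as nested lists. The summary only names the three terms, so everything else here is an assumption made for illustration: the element-wise maximum targets, the forward-difference gradient, the pairing of the $\ell_1$ term with a restored visible image and its haze-free reference, the default `alpha`, and all function names.

```python
def l1_mean(a, b):
    """Mean absolute difference between two equally sized 2-D lists."""
    total = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def grad_mag(img):
    """Forward-difference gradient magnitude |dx| + |dy| (zero past the border)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            dx = img[i][j + 1] - img[i][j] if j + 1 < w else 0.0
            dy = img[i + 1][j] - img[i][j] if i + 1 < h else 0.0
            out[i][j] = abs(dx) + abs(dy)
    return out


def elementwise_max(a, b):
    """Pixel-wise maximum of two equally sized 2-D lists."""
    return [[max(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def combined_loss(fused, ir, vis_clear, vis_restored, alpha=0.5):
    """Hypothetical total loss: l_int + l_grad + alpha * l_1."""
    # Assumed intensity target: the brighter of the IR and clear VIS pixels.
    l_int = l1_mean(fused, elementwise_max(ir, vis_clear))
    # Assumed gradient target: the stronger of the two source gradients.
    l_grad = l1_mean(grad_mag(fused),
                     elementwise_max(grad_mag(ir), grad_mag(vis_clear)))
    # Assumed restoration term: restored VIS against its haze-free reference.
    l_1 = l1_mean(vis_restored, vis_clear)
    return l_int + l_grad + alpha * l_1
```

In this sketch `alpha` only scales the restoration term, so raising it shifts the balance toward dehazing fidelity and away from the two fusion terms.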

Abstract

Infrared and visible (IR-VIS) image fusion has gained significant attention for its broad application value. However, existing methods often neglect the complementary role of infrared images in restoring visible image features under hazy conditions. To address this, we propose a joint learning framework that uses infrared images for the restoration and fusion of hazy IR-VIS images. To mitigate the adverse effects of feature diversity between IR-VIS images, we introduce a prompt generation mechanism that regulates modality-specific feature incompatibility. This mechanism creates a prompt selection matrix from non-shared image information and then generates prompt embeddings from a prompt pool. These embeddings help generate candidate features for dehazing. We further design an infrared-assisted feature restoration mechanism that selects candidate features based on haze density, enabling simultaneous restoration and fusion within a single-stage framework. To enhance fusion quality, we construct a multi-stage prompt embedding fusion module that leverages feature supplementation from the prompt generation module. Our method effectively fuses IR-VIS images while removing haze, yielding clear, haze-free fusion results. In contrast to two-stage methods that dehaze and then fuse, our approach enables collaborative training in a single-stage framework, making the model relatively lightweight and suitable for practical deployment. Experimental results validate its effectiveness and demonstrate advantages over existing methods. The source code of the paper is available at https://github.com/fangjiaqi0909/IASSF.
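The prompt generation step described in the abstract (selection weights derived from non-shared information, then an embedding drawn from a prompt pool) can be sketched in a heavily simplified form. This is not the authors' implementation: the abstract's selection matrix is reduced here to a single weight vector, and the linear projection `proj`, the softmax normalisation, and the names `select_prompt`, `non_shared`, and `pool` are all assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]


def select_prompt(non_shared, proj, pool):
    """Blend prompt-pool entries using weights computed from a feature vector.

    non_shared: list of D floats (assumed descriptor of modality-specific content)
    proj:       N x D list of lists (hypothetical learnable projection)
    pool:       N x C list of lists (hypothetical learnable prompt pool)
    """
    # One logit per pool entry, from a plain dot product.
    logits = [sum(w * x for w, x in zip(row, non_shared)) for row in proj]
    weights = softmax(logits)
    # The prompt embedding is the weighted sum of the pool entries.
    dim = len(pool[0])
    prompt = [sum(weights[k] * pool[k][c] for k in range(len(pool)))
              for c in range(dim)]
    return weights, prompt
```

Because the weights come from the input features, different image pairs draw different mixtures from the same pool, which is one way to obtain content-aware prompts from a fixed set of learnable parameters.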

Paper Structure

This paper contains 18 sections, 13 equations, 12 figures, 10 tables.

Figures (12)

  • Figure 1: Comparison of the existing method and our method for hazy IR-VIS image fusion. (a) The existing method. (b) Our method.
  • Figure 2: Overall framework of the proposed method. The input IR and hazy VIS image pair $\{\bm{I}_{ir}, \bm{I}_{vi}\}$ is processed by the PGM to obtain features $\{\bm{F}_{ir}, \bm{F}_{vi}\}$ and a prompt $\hat{\bm{P}}_{ir}$ for $\bm{F}_{ir}$. Through the PEB, the prompt embedding $\hat{\bm{P}}_{ir}$ is used to refine the IR feature $\bm{F}_{ir}$, reducing redundant information and generating the refined IR feature $\hat{\bm{F}}_{ir}$. The haze density estimation (HDE) module [34] estimates the haze density in the VIS features to dynamically adjust the proportion of injected IR information, preventing excessive IR injection. The Transformer block removes degradation from the input features to obtain haze-free features. In the MsPE-FM, the haze-free VIS features and IR features are combined and passed to the Fusion Block for feature fusion. The PGM and PEB are further used to enhance the IR-VIS complementary information before the final fused image is reconstructed.
  • Figure 3: (a) Illustration of the Atmospheric Light Estimation (ALE) module. (b) Illustration of the Get Haze Density (GHD) module.
  • Figure 4: Visual comparison of fusion results from different methods on the MSRS dataset. In the fusion results generated by the comparison methods, except for the last column, the first row of each image pair represents the result of first dehazing with DIACMP and then fusing. The second row shows the result of dehazing with Dehazeformer and then fusing. The last column represents the result of fusion using Text-IF (first row) and our method (second row).
  • Figure 5: Visual comparison of fusion results from different methods on the M$^3$FD dataset. In the fusion results generated by the comparison methods, except for the last column, the first row of each image pair represents the result of first dehazing with DIACMP and then fusing. The second row shows the result of dehazing with Dehazeformer and then fusing. The last column represents the result of fusion using Text-IF (first row) and our method (second row).
  • ...and 7 more figures
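
The Figure 2 caption states that the estimated haze density controls how much IR information is injected into the VIS features. A minimal sketch of that gating idea follows, assuming per-position densities that are clipped to [0, 1] and used directly as the injection weight in an additive update. The function name `inject_ir`, the clipping, and the additive form are illustrative assumptions, not the paper's formulation.

```python
def inject_ir(vis_feat, ir_feat, haze_density):
    """Add IR features to VIS features in proportion to local haze density.

    All three arguments are equally long lists of floats. Densities are
    clipped to [0, 1] so the injected share of IR can never exceed the
    full IR feature (the assumed guard against excessive IR injection).
    """
    out = []
    for v, r, d in zip(vis_feat, ir_feat, haze_density):
        w = min(1.0, max(0.0, d))  # clipped injection weight
        out.append(v + w * r)
    return out


# Clear positions keep the VIS feature; fully hazy ones receive all of the IR feature.
print(inject_ir([1.0, 1.0], [2.0, 2.0], [0.0, 1.0]))
```

Under these assumptions, haze-free regions pass through unchanged while heavily degraded regions lean on the infrared signal.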