CHITNet: A Complementary to Harmonious Information Transfer Network for Infrared and Visible Image Fusion

Keying Du, Huafeng Li, Yafei Zhang, Zhengtao Yu

TL;DR

A complementary to harmonious information transfer network (CHITNet) is proposed. It transfers complementary information into harmonious information, which integrates both the shared and the complementary features of the two modalities.

Abstract

Current infrared and visible image fusion (IVIF) methods go to great lengths to excavate complementary features and design complex fusion strategies, which is extremely challenging. To this end, we rethink IVIF from a different angle and propose a complementary to harmonious information transfer network (CHITNet). It transfers complementary information into harmonious information, which integrates both the shared and the complementary features of the two modalities. Specifically, to sidestep aggregating complementary information in IVIF, we design a mutual information transfer (MIT) module that mutually represents features from the two modalities, roughly transferring complementary information into harmonious information. Then, a harmonious information acquisition supervised by source image (HIASSI) module is devised to further ensure the complementary to harmonious information transfer after MIT. Meanwhile, we propose a structure information preservation (SIP) module to guarantee that the edge structure information of the source images is transferred to the fusion results. Moreover, a mutual promotion training paradigm with interaction loss is adopted to facilitate better collaboration among MIT, HIASSI, and SIP. In this way, the proposed method is able to generate fused images of higher quality. Extensive experimental results demonstrate the superiority of CHITNet over state-of-the-art algorithms in terms of visual quality and quantitative evaluations.

Paper Structure (32 sections, 21 equations, 12 figures, 7 tables, 1 algorithm)

This paper contains 32 sections, 21 equations, 12 figures, 7 tables, 1 algorithm.

Figures (12)

  • Figure 1: The core idea of our proposed CHITNet. Harmonious features are the more comprehensive features that cover both the information shared by the infrared and visible images and the modality-specific information of each source image.
  • Figure 2: Overall architecture of the proposed method. In phaseM, the input concatenated infrared and visible image pairs $\bm I_{ir}^{\prime}$ and $\bm I_{vis}^{\prime}$ are fed into the infrared and visible feature encoders $\bm E_{ir}$ and $\bm E_{vis}$, respectively, to obtain features $\bm F_{ir}$ and $\bm F_{vis}$. MIT is performed on $\bm F_{ir}$ and $\bm F_{vis}$ to achieve basic complementary to harmonious information transfer, yielding the transferred features $\bm F_{vis \leftrightarrow ir}$ and $\bm F_{ir \leftrightarrow vis}$. In phaseS, $\bm F_{vis \leftrightarrow ir}$ and $\bm F_{ir \leftrightarrow vis}$ are sent to SIPHIA to obtain $\bm F_{ir}^{ed}$, $\bm F_{ir}^{en}$, $\bm F_{vis}^{ed}$, and $\bm F_{vis}^{en}$, ensuring successful information transfer and effectively preserving structure information. Finally, $\bm F_{vis \leftrightarrow ir}$, $\bm F_{ir}^{ed}$, $\bm F_{ir}^{en}$, $\bm F_{ir \leftrightarrow vis}$, $\bm F_{vis}^{ed}$, and $\bm F_{vis}^{en}$ are concatenated and sent to the decoder $\bm D_{fuse}$ to obtain the final fusion result $\bm I_{fused}$.
  • Figure 3: Visual quality comparison for the ablation study of the key idea and modules. From left to right: infrared image, visible image, the results without (W/O) MIT, SIPHIA, and MPTP, and our CHITNet.
  • Figure 4: Infrared and visible feature maps before and after mutual information transfer.
  • Figure 5: Visual quality comparison of our method with seven SOTA fusion methods on the #FLIR_00288 and #FLIR_08835 images from the RoadScene dataset.
  • ...and 7 more figures
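
The data flow described in the Figure 2 caption can be traced at the level of tensor shapes. The following is a minimal sketch, not the authors' implementation: every function is a hypothetical stand-in that tracks only a (channels, height, width) shape, the channel count `FEAT_CH`, the single-channel output, and the unchanged spatial size are assumptions, and the pairing of each transferred feature with its SIPHIA outputs is inferred from the concatenation order in the caption.

```python
from typing import Dict, Tuple

Shape = Tuple[int, int, int]  # (channels, height, width)

# Hypothetical placeholder; the caption does not give channel counts.
FEAT_CH = 64


def encode(image: Shape, out_ch: int = FEAT_CH) -> Shape:
    """Stand-in for E_ir / E_vis: assumed to keep the spatial size."""
    _, h, w = image
    return (out_ch, h, w)


def mit(f_ir: Shape, f_vis: Shape) -> Tuple[Shape, Shape]:
    """Stand-in for MIT: returns (F_{vis<->ir}, F_{ir<->vis}).

    Shapes are assumed to be unchanged by the transfer.
    """
    assert f_ir == f_vis, "this sketch assumes matching feature shapes"
    return f_ir, f_vis


def siphia(f: Shape) -> Tuple[Shape, Shape]:
    """Stand-in for SIPHIA: returns the shapes of (F^ed, F^en)."""
    return f, f


def concat_channels(*feats: Shape) -> Shape:
    """Channel-wise concatenation; spatial sizes must agree."""
    h, w = feats[0][1], feats[0][2]
    assert all(f[1:] == (h, w) for f in feats), "spatial sizes differ"
    return (sum(f[0] for f in feats), h, w)


def decode(f: Shape) -> Shape:
    """Stand-in for D_fuse: assumed to emit a single-channel fused image."""
    _, h, w = f
    return (1, h, w)


def trace(i_ir: Shape, i_vis: Shape) -> Dict[str, Shape]:
    """Follow the caption: encode, MIT, SIPHIA, concatenate, decode."""
    f_ir, f_vis = encode(i_ir), encode(i_vis)
    f_vis_ir, f_ir_vis = mit(f_ir, f_vis)
    # Assumed pairing: F_{vis<->ir} yields the "ir" outputs, and vice versa.
    f_ir_ed, f_ir_en = siphia(f_vis_ir)
    f_vis_ed, f_vis_en = siphia(f_ir_vis)
    dec_in = concat_channels(
        f_vis_ir, f_ir_ed, f_ir_en, f_ir_vis, f_vis_ed, f_vis_en
    )
    return {"decoder_input": dec_in, "fused": decode(dec_in)}
```

With these placeholder sizes, `trace((2, 128, 128), (2, 128, 128))` gives the decoder an input with 6 × 64 = 384 channels, which makes explicit that six feature maps, three per modality branch, reach $\bm D_{fuse}$.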