SFDFusion: An Efficient Spatial-Frequency Domain Fusion Network for Infrared and Visible Image Fusion

Kun Hu, Qingle Zhang, Maoxun Yuan, Yitian Zhang

TL;DR

This paper proposes SFDFusion, an efficient Spatial-Frequency Domain Fusion network for infrared and visible image fusion. It constructs a Dual-Modality Refinement Module (DMRM) to extract complementary information and designs a frequency domain fusion loss to guide the fusion process.

Abstract

Infrared and visible image fusion aims to utilize the complementary information from two modalities to generate fused images with prominent targets and rich texture details. Most existing algorithms only perform pixel-level or feature-level fusion from different modalities in the spatial domain. They usually overlook the information in the frequency domain, and some of them suffer from inefficiency due to excessively complex structures. To tackle these challenges, this paper proposes an efficient Spatial-Frequency Domain Fusion (SFDFusion) network for infrared and visible image fusion. First, we propose a Dual-Modality Refinement Module (DMRM) to extract complementary information. This module extracts useful information from both the infrared and visible modalities in the spatial domain and enhances fine-grained spatial details. Next, to introduce frequency domain information, we construct a Frequency Domain Fusion Module (FDFM) that transforms the spatial domain to the frequency domain through Fast Fourier Transform (FFT) and then integrates frequency domain information. Additionally, we design a frequency domain fusion loss to provide guidance for the fusion process. Extensive experiments on public datasets demonstrate that our method produces fused images with significant advantages in various fusion metrics and visual effects. Furthermore, our method demonstrates high efficiency in image fusion and good performance on downstream detection tasks, thereby satisfying the real-time demands of advanced visual tasks.
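To make the frequency-domain idea in the abstract concrete, here is a minimal NumPy sketch of fusing two images through the FFT and of scoring a fused image with a frequency-domain loss. It is not the paper's FDFM or its loss: the fusion rule (maximum amplitude, phase taken from the visible image) and the L1 amplitude loss are assumptions chosen for illustration, and the names `fft_fuse` and `frequency_loss` are hypothetical.

```python
import numpy as np


def fft_fuse(ir, vis):
    """Fuse two same-sized grayscale images in the frequency domain.

    Assumed rule (not from the paper): keep the larger amplitude at each
    frequency and reuse the phase of the visible image.
    """
    spec_ir = np.fft.fft2(ir)
    spec_vis = np.fft.fft2(vis)

    # Split each spectrum into amplitude and phase.
    amplitude = np.maximum(np.abs(spec_ir), np.abs(spec_vis))
    phase = np.angle(spec_vis)

    # Recombine and go back to the spatial domain. The imaginary part is
    # only numerical noise here, so it is dropped.
    fused_spec = amplitude * np.exp(1j * phase)
    return np.real(np.fft.ifft2(fused_spec))


def frequency_loss(fused, ir, vis):
    """Mean absolute gap between the fused amplitude spectrum and the
    element-wise maximum of the two source amplitude spectra (assumed target)."""
    amp_fused = np.abs(np.fft.fft2(fused))
    amp_target = np.maximum(np.abs(np.fft.fft2(ir)), np.abs(np.fft.fft2(vis)))
    return float(np.mean(np.abs(amp_fused - amp_target)))
```

Calling `fft_fuse(ir, vis)` on two arrays of the same shape returns a fused array of that shape, and `frequency_loss` is close to zero for an image produced by that same rule. In a trained network the fusion rule would be learned rather than fixed as it is here.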

Paper Structure

This paper contains 17 sections, 12 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Comparison of different methods for combining frequency domains. (a) Only frequency domain processing. (b) Serial structure for spatial and frequency domain processing. (c) Our parallel fusion structure of spatial and frequency domains.
  • Figure 2: The overall architecture of the SFDFusion network. The network adopts a parallel dual-branch structure, dedicated to refining the spatial domain and integrating the frequency domain. The spatial domain branch consists of DMRM, while the frequency domain branch consists of FDFM. After concatenating the outputs of different branches along the channel dimension, the final fused image is obtained through several simple convolution operations.
  • Figure 3: Comparison of fusion results on the image "00024N" from the MSRS dataset using different methods.
  • Figure 4: A comparison of the detection performance after fusion on image "00548" from the M3FD dataset.
  • Figure 5: Visual comparison of ablation experiments.
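
The parallel layout described in the captions of Figures 1(c) and 2 can be sketched as follows: both branches receive the same inputs, their outputs are concatenated along the channel dimension, and a small convolution merges them. The branch bodies below are simple stand-ins, not DMRM or FDFM, and the fixed weights replace what would be learned convolution kernels; all function names are hypothetical.

```python
import numpy as np


def spatial_branch(ir, vis):
    # Stand-in for the spatial branch (assumption): element-wise maximum.
    return np.maximum(ir, vis)


def frequency_branch(ir, vis):
    # Stand-in for the frequency branch (assumption): average the two
    # spectra, then transform back to the spatial domain.
    spec = 0.5 * (np.fft.fft2(ir) + np.fft.fft2(vis))
    return np.real(np.fft.ifft2(spec))


def fuse_parallel(ir, vis, weights=(0.5, 0.5)):
    # Parallel structure: each branch sees the original inputs, rather
    # than one branch consuming the other's output (serial structure).
    feats = np.stack(
        [spatial_branch(ir, vis), frequency_branch(ir, vis)], axis=0
    )  # shape (2, H, W): branch outputs stacked along the channel axis

    # A 1x1 convolution across channels is a per-pixel weighted sum.
    w = np.asarray(weights, dtype=float).reshape(-1, 1, 1)
    return (w * feats).sum(axis=0)
```

With two images of shape `(H, W)`, `fuse_parallel(ir, vis)` returns one fused image of the same shape. Swapping either stand-in for a learned module leaves the surrounding structure unchanged.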