Infrared and Visible Image Fusion: From Data Compatibility to Task Adaption

Jinyuan Liu, Guanyao Wu, Zhu Liu, Di Wang, Zhiying Jiang, Long Ma, Wei Zhong, Xin Fan, Risheng Liu

TL;DR

This survey comprehensively maps learning-based infrared-visible image fusion (IVIF) across data compatibility, fusion mechanisms, and downstream tasks. It introduces a multi-dimensional taxonomy, reviews AE/CNN/GAN/Transformer approaches, and discusses application-oriented methods for object detection and semantic segmentation, along with data-compatibility and attack considerations. The authors provide benchmarks, metrics, and a detailed performance summary, highlighting strengths and limitations while outlining future directions such as robust registration, better evaluation metrics, and lightweight designs. By analyzing over 180 methods and offering a lookup table of core ideas, the paper aims to guide practitioners and researchers toward more robust, task-aware, and efficient IVIF solutions with practical impact across surveillance, autonomous systems, and remote sensing.

Abstract

Infrared-visible image fusion (IVIF) is a critical task in computer vision, aimed at integrating the unique features of both infrared and visible spectra into a unified representation. Since 2018, the field has entered the deep learning era, with an increasing variety of approaches introducing a range of networks and loss functions to enhance visual performance. However, challenges such as data compatibility, perception accuracy, and efficiency remain. Unfortunately, there is a lack of recent comprehensive surveys that address this rapidly expanding domain. This paper fills that gap by providing a thorough survey covering a broad range of topics. We introduce a multi-dimensional framework to elucidate common learning-based IVIF methods, from visual enhancement strategies to data compatibility and task adaptability. We also present a detailed analysis of these approaches, accompanied by a lookup table clarifying their core ideas. Furthermore, we summarize performance comparisons, both quantitatively and qualitatively, focusing on registration, fusion, and subsequent high-level tasks. Beyond technical analysis, we discuss potential future directions and open issues in this area. For further details, visit our GitHub repository: https://github.com/RollingPlain/IVIF_ZOO.

Paper Structure

This paper contains 47 sections, 12 figures, 7 tables.

Figures (12)

  • Figure 1: A detailed spectrum diagram covering almost all wavelength and frequency ranges, with an expanded view of the range around the human visual system, annotated with corresponding computer vision and image fusion datasets [lin2014microsoft].
  • Figure 2: A knowledge graph of our survey. We first examine IVIF along three dimensions, and then elaborate on fusion across seven specific aspects.
  • Figure 3: Infrared and visible image fusion in practical applications. Existing image fusion methods focus mainly on designing architectures and training strategies for visual enhancement; few consider adaptation to downstream visual perception tasks. From the data compatibility perspective, pixel misalignment and adversarial attacks are two major challenges. In addition, integrating comprehensive semantic information for tasks such as semantic segmentation, object detection, and salient object detection remains underexplored, posing a critical obstacle in image fusion.
  • Figure 4: A Sankey diagram classifying typical fusion methods.
  • Figure 5: The basic phased processes of AE / CNN / GAN / Transformer-based IVIF methods.
  • ...and 7 more figures