UMCFuse: A Unified Multiple Complex Scenes Infrared and Visible Image Fusion Framework
Xilai Li, Xiaosong Li, Tianshu Tan, Huafeng Li, Tao Ye
TL;DR
UMCFuse addresses infrared-visible image fusion (IVIF) in complex scenes by introducing a transmission-map-guided decomposition that separates interference from content, followed by separate fusion of the high- and low-frequency components. It uses a Scale-Aware Noise Suppression Filter (SANF), Monogenic Phase Consistency (MPC) for high-frequency feature fusion, and a multi-directional, Gabor-based approach for low-frequency fusion; the final result is the sum of the fused high- and low-frequency components, $F = F_H + F_L$. Extensive experiments on real and synthetic complex scenes show superior fusion quality across multiple metrics and improved performance in downstream tasks such as semantic segmentation, object detection, salient object detection, and depth estimation, and demonstrate cross-domain applicability to medical image fusion. The work provides a practical, single-framework solution for robust IVIF in adverse conditions and releases the code for reproducibility.
Abstract
Infrared and visible image fusion has emerged as a prominent research area in computer vision. However, little attention has been paid to the fusion task in complex scenes, leading to sub-optimal results under interference. To fill this gap, we propose a unified framework for infrared and visible image fusion in complex scenes, termed UMCFuse. Specifically, we classify the pixels of visible images according to the degree of light scattering during transmission, allowing us to separate fine details from overall intensity. Maintaining a balance between interference removal and detail preservation is essential for the generalization capacity of the proposed method. Therefore, we propose an adaptive denoising strategy for the fusion of detail layers. Meanwhile, we fuse the energy features from different modalities by analyzing them from multiple directions. Extensive fusion experiments on real and synthetic complex-scene datasets, covering adverse weather conditions, noise, blur, overexposure, and fire, as well as downstream tasks including semantic segmentation, object detection, salient object detection, and depth estimation, consistently indicate the superiority of the proposed method over recent representative methods. Our code is available at https://github.com/ixilai/UMCFuse.
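
To make the decompose, fuse, and recombine structure concrete, the following is a minimal sketch of a generic two-layer fusion in which the fused image is the sum of a fused detail (high-frequency) layer and a fused energy (low-frequency) layer. It is not the UMCFuse implementation: a plain box filter stands in for the transmission-guided decomposition, a max-absolute rule stands in for the detail-layer fusion, and simple averaging stands in for the multi-directional energy fusion. The names `box_blur` and `fuse_two_layer` and the `radius` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def box_blur(img, radius=2):
    """Separable mean filter with edge padding (a stand-in low-pass filter)."""
    k = 2 * radius + 1
    kernel = np.ones(k) / k
    padded = np.pad(img, radius, mode="edge")
    # Filter each row, then each column; "valid" mode undoes the padding.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


def fuse_two_layer(ir, vis, radius=2):
    """Generic two-layer fusion sketch (assumed rules, not the paper's)."""
    ir = np.asarray(ir, dtype=np.float64)
    vis = np.asarray(vis, dtype=np.float64)

    # Decompose each source into a low-frequency and a high-frequency layer.
    ir_low, vis_low = box_blur(ir, radius), box_blur(vis, radius)
    ir_high, vis_high = ir - ir_low, vis - vis_low

    # Detail layer: keep the coefficient with the larger magnitude (assumption).
    f_high = np.where(np.abs(ir_high) >= np.abs(vis_high), ir_high, vis_high)
    # Energy layer: plain average of the two low-frequency layers (assumption).
    f_low = 0.5 * (ir_low + vis_low)

    # Recombine the two fused layers into the final image.
    return f_high + f_low, f_high, f_low


# Usage on synthetic data (random arrays standing in for registered images).
rng = np.random.default_rng(0)
ir_img = rng.random((32, 32))
vis_img = rng.random((32, 32))
fused, fused_high, fused_low = fuse_two_layer(ir_img, vis_img)
```

A useful sanity check for any scheme of this shape is that fusing an image with itself returns that image, since the two layers then sum back to the original.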
