Coupled Degradation Modeling and Fusion: A VLM-Guided Degradation-Coupled Network for Degradation-Aware Infrared and Visible Image Fusion
Tianpei Zhang, Jufeng Zhao, Yiming Zhu, Guangmang Cui
TL;DR
This work tackles degraded infrared and visible image fusion (IVIF) by introducing VGDCFusion, a degradation-aware framework that tightly couples degradation modeling with fusion using vision-language model (VLM) prompts. It comprises two modules: SPDCE, which enables intra-modality degradation awareness and couples degradation suppression with feature extraction, and JPDCF, which facilitates cross-modal degradation perception and integrates degradation filtering with cross-modal feature fusion. The authors report that the approach outperforms existing fusion methods under diverse degraded scenarios, supported by extensive experiments, ablations, and a downstream object detection task, which suggests practical value for robust IVIF in real-world conditions. The authors also provide code to facilitate adoption and further research in degradation-aware multi-modal fusion.
Abstract
Existing Infrared and Visible Image Fusion (IVIF) methods typically assume high-quality inputs. However, when handling degraded images, these methods rely heavily on manually switching between different pre-processing techniques. This decoupling of degradation handling and image fusion leads to significant performance degradation. In this paper, we propose a novel VLM-Guided Degradation-Coupled Fusion network (VGDCFusion), which tightly couples degradation modeling with the fusion process and leverages vision-language models (VLMs) for degradation-aware perception and guided suppression. Specifically, the proposed Specific-Prompt Degradation-Coupled Extractor (SPDCE) enables modality-specific degradation awareness and establishes a joint modeling of degradation suppression and intra-modal feature extraction. In parallel, the Joint-Prompt Degradation-Coupled Fusion (JPDCF) facilitates cross-modal degradation perception and couples residual degradation filtering with complementary cross-modal feature fusion. Extensive experiments demonstrate that our VGDCFusion significantly outperforms existing state-of-the-art fusion approaches under various degraded image scenarios. Our code is available at https://github.com/Lmmh058/VGDCFusion.
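
To make the idea of prompt-guided, degradation-coupled processing more concrete, the following is a minimal toy sketch in plain Python. It is not the authors' SPDCE or JPDCF; the actual modules are defined in the linked repository. The sketch assumes that a prompt embedding can condition features through a scale-and-shift modulation (a FiLM-style swap-in chosen for illustration) and that fusion can be expressed as a prompt-driven per-channel gate. The function names `film_modulate` and `gated_fuse`, the vector layout, and all numeric values are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def film_modulate(features: Vector, prompt: Vector) -> Vector:
    """Condition features on a prompt embedding (assumed FiLM-style).

    Assumption: the first half of `prompt` holds per-channel scale offsets
    and the second half holds per-channel shifts. This stands in for
    "modality-specific prompt guides extraction"; it is not SPDCE itself.
    """
    n = len(features)
    if len(prompt) != 2 * n:
        raise ValueError("prompt must have twice as many entries as features")
    scale, shift = prompt[:n], prompt[n:]
    return [f * (1.0 + s) + b for f, s, b in zip(features, scale, shift)]


def gated_fuse(ir: Vector, vis: Vector, joint_prompt: Vector) -> Vector:
    """Fuse two modalities with a per-channel gate derived from a joint prompt.

    Assumption: a sigmoid of the joint prompt gives the infrared weight and
    its complement gives the visible weight. This stands in for
    "joint prompt guides cross-modal fusion"; it is not JPDCF itself.
    """
    if not (len(ir) == len(vis) == len(joint_prompt)):
        raise ValueError("all inputs must have the same length")
    gates = [1.0 / (1.0 + math.exp(-p)) for p in joint_prompt]
    return [g * a + (1.0 - g) * b for g, a, b in zip(gates, ir, vis)]


if __name__ == "__main__":
    # Hypothetical two-channel features and prompt embeddings.
    ir_feat = film_modulate([1.0, 2.0], [0.0, 0.5, 0.1, -0.2])
    vis_feat = film_modulate([0.5, 0.5], [0.0, 0.0, 0.0, 0.0])
    print(gated_fuse(ir_feat, vis_feat, [0.0, 0.0]))
```

The point of the sketch is only the coupling: the same prompt signal that describes the degradation also shapes the features being extracted and fused, rather than selecting a separate pre-processing step beforehand.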
