DCEvo: Discriminative Cross-Dimensional Evolutionary Learning for Infrared and Visible Image Fusion
Jinyuan Liu, Bowei Zhang, Qingyun Mei, Xingyuan Li, Yang Zou, Zhiying Jiang, Long Ma, Risheng Liu, Xin Fan
TL;DR
The paper addresses the challenge of jointly optimizing infrared and visible image fusion (IVIF) and downstream perception tasks. It introduces DCEvo, a framework that combines a Discriminative Enhancer, which emphasizes object-centric features, with a Cross-Dimensional Embedding, which lets high-dimensional task features and low-dimensional fusion features supervise each other; an Evolutionary Algorithm adaptively balances the multiple objectives. Key contributions include modeling dual-task optimization as a multi-objective problem, learning evolutionary hyperparameters, and demonstrating improved visual quality together with better downstream detection and segmentation performance. The approach shows consistent gains on three IVIF benchmarks and suggests a practical path toward task-aware fusion in real-world intelligent systems. Code is available for reproducibility.
Abstract
Infrared and visible image fusion integrates information from distinct spectral bands to enhance image quality by leveraging the strengths and mitigating the limitations of each modality. Existing approaches typically treat image fusion and subsequent high-level tasks as separate processes, resulting in fused images that offer only marginal gains in task performance and fail to provide constructive feedback for optimizing the fusion process. To overcome these limitations, we propose a Discriminative Cross-Dimension Evolutionary Learning Framework, termed DCEvo, which simultaneously enhances visual quality and perception accuracy. Leveraging the robust search capabilities of Evolutionary Learning, our approach formulates the optimization of dual tasks as a multi-objective problem by employing an Evolutionary Algorithm (EA) to dynamically balance loss function parameters. Inspired by visual neuroscience, we integrate a Discriminative Enhancer (DE) within both the encoder and decoder, enabling the effective learning of complementary features from different modalities. Additionally, our Cross-Dimensional Embedding (CDE) block facilitates mutual enhancement between high-dimensional task features and low-dimensional fusion features, ensuring a cohesive and efficient feature integration process. Experimental results on three benchmarks demonstrate that our method significantly outperforms state-of-the-art approaches, achieving an average improvement of 9.32% in visual quality while also enhancing subsequent high-level tasks. The code is available at https://github.com/Beate-Suy-Zhang/DCEvo.
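To make the idea of using an Evolutionary Algorithm to balance loss function parameters more concrete, here is a minimal toy sketch. It is not the DCEvo implementation: the two "losses" are one-dimensional surrogates, and the function names (`train_surrogate`, `evaluate`, `evolve`), the worst-case (max) scalarization, the truncation selection, and the Gaussian mutation are all assumptions chosen only for illustration.

```python
import random


def train_surrogate(w_fusion, w_task):
    """Hypothetical stand-in for training a network under a weighted loss.

    Returns the minimizer of w_fusion*(theta-0)^2 + w_task*(theta-1)^2,
    where 0 is the fusion optimum and 1 is the task optimum.
    """
    return w_task / (w_fusion + w_task)


def evaluate(weights):
    """Score a weight pair by its worst objective (lower is better)."""
    w_fusion, w_task = weights
    theta = train_surrogate(w_fusion, w_task)
    fusion_err = (theta - 0.0) ** 2  # toy proxy for visual-quality error
    task_err = (theta - 1.0) ** 2    # toy proxy for perception error
    return max(fusion_err, task_err)


def normalize(weights):
    """Rescale weights so they sum to 1."""
    total = sum(weights)
    return tuple(w / total for w in weights)


def mutate(weights, rng, sigma=0.1):
    """Gaussian perturbation, clipped to stay positive, then renormalized."""
    return normalize(tuple(max(1e-3, w + rng.gauss(0.0, sigma)) for w in weights))


def evolve(pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    population = [
        normalize((rng.uniform(0.01, 1.0), rng.uniform(0.01, 1.0)))
        for _ in range(pop_size)
    ]
    history = []  # best score per generation
    for _ in range(generations):
        population.sort(key=evaluate)
        history.append(evaluate(population[0]))
        # Truncation selection: the better half survives unchanged (elitism).
        parents = population[: pop_size // 2]
        children = [
            mutate(rng.choice(parents), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    best = min(population, key=evaluate)
    history.append(evaluate(best))
    return best, history


best_weights, history = evolve()
print("best (w_fusion, w_task):", best_weights)
print("best worst-case error:", history[-1])
```

In this toy setting the search drifts toward equal weights, since that is where neither surrogate objective dominates. In a real pipeline each call to `evaluate` would involve training or fine-tuning the fusion network and measuring both fusion quality and task accuracy, which is far more expensive than this sketch suggests.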
