ReFusion: Learning Image Fusion from Reconstruction with Learnable Loss via Meta-Learning

Haowen Bai, Zixiang Zhao, Jiangshe Zhang, Yichen Wu, Lilun Deng, Yukun Cui, Shuang Xu, Baisong Jiang

TL;DR

ReFusion tackles the lack of ground truth and inflexible loss design in image fusion by learning a task-adaptive fusion loss through a Loss Proposal Module trained with reconstruction supervision. The framework combines a lightweight fusion module, a source reconstruction module, and a meta-learned loss proposer, with a three-stage training cycle (inner, outer, and fusion-reconstruction updates) that tailors the loss to infrared-visible (IVIF), medical (MIF), multi-focus (MFIF), and multi-exposure (MEIF) image fusion. Key contributions include a parameterized, learnable fusion loss, a reconstruction-guided meta-learning strategy, and a lightweight architecture that achieves state-of-the-art performance across multiple fusion tasks while remaining parameter-efficient. The approach demonstrates strong empirical results and broad applicability, with practical relevance to multimodal fusion in medical imaging, remote sensing, and photography under diverse conditions.

Abstract

Image fusion aims to combine information from multiple source images into a single one with more comprehensive informational content. Deep learning-based image fusion algorithms face significant challenges, including the lack of a definitive ground truth and of a corresponding distance measurement. Additionally, current manually defined loss functions limit the model's flexibility and generalizability for various fusion tasks. To address these limitations, we propose ReFusion, a unified meta-learning-based image fusion framework that dynamically optimizes the fusion loss for various tasks through source image reconstruction. Compared to existing methods, ReFusion employs a parameterized loss function that allows the training framework to be dynamically adapted to the specific fusion scenario and task. ReFusion consists of three key components: a fusion module, a source reconstruction module, and a loss proposal module. We employ a meta-learning strategy to train the loss proposal module using the reconstruction loss. This strategy forces the fused image to be more conducive to reconstructing the source images, allowing the loss proposal module to generate an adaptive fusion loss that preserves the optimal information from the source images. The update of the fusion module relies on the learnable fusion loss proposed by the loss proposal module. The three modules update alternately, enhancing each other to optimize the fusion loss for different tasks and consistently achieve satisfactory results. Extensive experiments demonstrate that ReFusion is capable of adapting to various tasks, including infrared-visible, medical, multi-focus, and multi-exposure image fusion.
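
The alternating updates described in the abstract can be illustrated with a deliberately tiny sketch. Everything in it is a hypothetical stand-in rather than the authors' implementation: each "image" is a single scalar, the fusion module is a linear blend with one parameter `theta`, the loss proposal module is a single logit `phi` passed through a sigmoid, the reconstruction module is a pair of scale factors `r`, and gradients come from finite differences instead of backpropagation. The function names (`train_cycle`, `inner_step`, and so on) are likewise invented for illustration.

```python
import math

def num_grad(fn, x, eps=1e-5):
    """Central finite-difference derivative of a scalar function (assumed stand-in for autodiff)."""
    return (fn(x + eps) - fn(x - eps)) / (2 * eps)

def fuse(theta, a, b):
    """Toy fusion module: blend two scalar 'images' with weight theta."""
    return theta * a + (1.0 - theta) * b

def fusion_loss(theta, phi, a, b):
    """Toy learnable fusion loss: phi sets how strongly the fused value is pulled toward a vs. b."""
    w = 1.0 / (1.0 + math.exp(-phi))  # weight on source a; weight on b is 1 - w
    f = fuse(theta, a, b)
    return w * (f - a) ** 2 + (1.0 - w) * (f - b) ** 2

def recon_loss(theta, r, a, b):
    """Toy reconstruction loss: recover both sources from the fused value by scaling."""
    f = fuse(theta, a, b)
    return (r[0] * f - a) ** 2 + (r[1] * f - b) ** 2

def inner_step(theta, phi, a, b, lr):
    """One gradient step on the fusion parameter under the currently proposed loss."""
    return theta - lr * num_grad(lambda t: fusion_loss(t, phi, a, b), theta)

def train_cycle(theta, phi, r, meta_train, meta_test, lr=0.1):
    a, b = meta_train
    a_t, b_t = meta_test

    # Stages 1 and 2: take a trial (uncommitted) inner step on theta, then
    # update phi so that this trial step lowers the reconstruction loss
    # measured on the meta-test sample.
    def meta_objective(p):
        theta_trial = inner_step(theta, p, a, b, lr)
        return recon_loss(theta_trial, r, a_t, b_t)

    phi = phi - lr * num_grad(meta_objective, phi)

    # Stage 3: commit the fusion update under the refreshed loss, then
    # update the reconstruction parameters with the reconstruction loss.
    theta = inner_step(theta, phi, a, b, lr)
    r = [
        r[k] - lr * num_grad(
            lambda x, k=k: recon_loss(theta, r[:k] + [x] + r[k + 1:], a, b), r[k]
        )
        for k in range(2)
    ]
    return theta, phi, r
```

Calling `train_cycle(0.3, 0.0, [1.0, 1.0], (0.8, 0.2), (0.9, 0.1))` runs one full cycle. The point of the sketch is the dependency structure: the loss parameter `phi` is never trained on the fusion loss itself, only on how well the fused output supports reconstruction after a trial fusion step.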

Paper Structure

This paper contains 25 sections, 10 equations, 14 figures, 7 tables, 1 algorithm.

Figures (14)

  • Figure 1: Comparison of the proposed ReFusion framework with existing methods, such as DBLP:journals/ijcv/ZhangM21. (a) In previous fusion methods, reconstruction loss supports the fusion loss during the training of the fusion module. (b) In ReFusion, reconstruction loss is employed to supervise the fusion loss proposal module in a meta-learning manner. This supervision ensures that the fused images more effectively reconstruct the source images, enabling the loss proposal module to propose a fusion loss function that better preserves the source image information. (c) The parameters of the learnable fusion loss, output by the loss proposal module, where $w_{a}^{ij}+w_{b}^{ij}=1$ and $v_{a}^{ij}+v_{b}^{ij}=1$. The fusion loss is learned by supervising the loss proposal module, which dynamically adjusts the parameters based on the specific requirements of the fusion task.
  • Figure 2: Workflow illustration of ReFusion. The three alternating stages are denoted by red, green, and blue. The inner update (red) attempts to update the fusion module $\mathcal{F}$ using the currently proposed fusion loss. The outer update (green) updates the loss proposal module $\mathcal{P}$ using the reconstruction loss on the meta-test set. The fusion and reconstruction update (blue) optimizes $\mathcal{F}$ and the reconstruction module $\mathcal{R}$ using the fusion loss $\mathcal{L}_f$ and the reconstruction loss $\mathcal{L}_r$.
  • Figure 3: The structure of the Adaptive Fusion Module (AFM), which consists of cross-attention interactive feature extraction and gating feature refinement.
  • Figure 4: Visual comparison for "00710N" and "00906N" in the MSRS dataset for IVIF.
  • Figure 5: Visualization of the MRI-PET case in the Harvard dataset for MIF.
  • ...and 9 more figures
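
The constraint in the Figure 1(c) caption, $w_{a}^{ij}+w_{b}^{ij}=1$, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's loss: it assumes the loss proposal module emits two unconstrained scores per pixel that are normalized with a two-way softmax, and it assumes $w$ weights a squared intensity error. This excerpt does not say which terms $w$ and $v$ weight, so the same normalization would simply be repeated for $v$ on whatever term it governs. The names `pair_softmax` and `weighted_fusion_loss` are hypothetical.

```python
import math

def pair_softmax(logit_a, logit_b):
    """Turn two unconstrained scores into positive weights that sum to 1."""
    m = max(logit_a, logit_b)  # subtract the max for numerical stability
    e_a, e_b = math.exp(logit_a - m), math.exp(logit_b - m)
    return e_a / (e_a + e_b), e_b / (e_a + e_b)

def weighted_fusion_loss(fused, src_a, src_b, logits_a, logits_b):
    """Mean over pixels (i, j) of w_a*(f - a)^2 + w_b*(f - b)^2.

    All arguments are 2-D lists of equal shape. The squared intensity
    error is an assumed choice made only for this illustration.
    """
    total, count = 0.0, 0
    for i, row in enumerate(fused):
        for j, f in enumerate(row):
            w_a, w_b = pair_softmax(logits_a[i][j], logits_b[i][j])
            total += w_a * (f - src_a[i][j]) ** 2 + w_b * (f - src_b[i][j]) ** 2
            count += 1
    return total / count
```

Because the weights are tied per pixel, raising the weight on one source necessarily lowers it on the other, so the proposed loss decides at each location which source the fused image should follow more closely.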