Test-Time Dynamic Image Fusion

Bing Cao, Yinan Xia, Yi Ding, Changqing Zhang, Qinghua Hu

TL;DR

The paper proves that the key to reducing generalization error in image fusion is a negative correlation between the fusion weight, derived from Relative Dominability (RD), and the uni-source reconstruction loss. The resulting test-time dynamic fusion paradigm provably reduces the upper bound of the generalization error.

Abstract

The inherent challenge of image fusion lies in capturing the correlation of multi-source images and comprehensively integrating effective information from different sources. Most existing techniques do not perform dynamic image fusion and notably lack theoretical guarantees, leading to potential deployment risks in this field. Is it possible to conduct dynamic image fusion with a clear theoretical justification? In this paper, we give our solution from a generalization perspective. We reveal the generalized form of image fusion and derive a new test-time dynamic image fusion paradigm that provably reduces the upper bound of generalization error. Specifically, we decompose the fused image into multiple components corresponding to its source data. The decomposed components represent the effective information from the source data; thus, the gap between them reflects the Relative Dominability (RD) of the uni-source data in constructing the fusion image. Theoretically, we prove that the key to reducing generalization error hinges on the negative correlation between the RD-based fusion weight and the uni-source reconstruction loss. Intuitively, RD dynamically highlights the dominant regions of each source and can be naturally converted to the corresponding fusion weight, achieving robust results. Extensive experiments and discussions with in-depth analysis on multiple benchmarks confirm our findings and the superiority of our method. Our code is available at https://github.com/Yinan-Xia/TTD.
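The mechanism described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation (the linked repository is the reference for that). The function names `rd_fusion_weights` and `fuse`, the per-pixel L1 reconstruction loss, the softmax over negative loss, the `temperature` parameter, and the weighted sum of sources are all hypothetical choices made here. The only properties taken from the paper are that the fusion weight is negatively correlated with the uni-source reconstruction loss and that the weights sum to 1 across sources.

```python
import numpy as np


def rd_fusion_weights(sources, components, temperature=1.0):
    """Hypothetical helper: per-pixel fusion weights from reconstruction loss.

    sources, components: arrays of shape (M, H, W), one slice per source.
    Assumption: a softmax over the NEGATIVE per-pixel L1 loss stands in for
    the paper's RD-to-weight conversion, so a lower loss gives a higher weight.
    """
    sources = np.asarray(sources, dtype=np.float64)
    components = np.asarray(components, dtype=np.float64)

    # Per-pixel reconstruction loss of each uni-source component (assumed L1).
    loss = np.abs(components - sources)

    # Softmax across the source axis; subtract the max for numerical stability.
    logits = -loss / temperature
    logits -= logits.max(axis=0, keepdims=True)
    weights = np.exp(logits)
    return weights / weights.sum(axis=0, keepdims=True)


def fuse(sources, weights):
    """Hypothetical helper: per-pixel weighted sum of the sources."""
    return (np.asarray(weights) * np.asarray(sources, dtype=np.float64)).sum(axis=0)


# Usage on synthetic data: two 4x4 "sources" and noisy stand-in components.
rng = np.random.default_rng(0)
demo_sources = rng.uniform(0.0, 1.0, size=(2, 4, 4))
demo_components = demo_sources + rng.normal(0.0, 0.1, size=(2, 4, 4))
demo_weights = rd_fusion_weights(demo_sources, demo_components)
demo_fused = fuse(demo_sources, demo_weights)
```

Because the weights are a softmax over the source axis, they are positive and sum to 1 at every pixel, so the fused value at each pixel is a convex combination of the source values there.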

Paper Structure

This paper contains 12 sections, 1 theorem, 7 equations, 5 figures, 2 tables.

Key Result

Theorem 3.1

(Decomposition of Generalization Error). The generalization error (GError) of a multi-source image fusion model $f$ can be decomposed into a linear combination of the uni-source component reconstruction losses, under the condition that the fusion weights satisfy $\sum_{m=1}^M\omega^{(m)}=1$. The detailed proof is given in the paper's appendix.
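One way to see why the negative correlation matters is the standard identity $\mathbb{E}[\omega\ell] = \mathbb{E}[\omega]\,\mathbb{E}[\ell] + \mathrm{Cov}(\omega, \ell)$: for a weighted loss term, a negative covariance between weight and loss makes the expected weighted loss smaller than the product of the means. The Python sketch below checks this numerically on synthetic data. It is an illustration only, not the paper's proof or its exact bound. The helper name `weighted_loss_terms`, the uniform synthetic losses, and the two weight functions are assumptions made here.

```python
import numpy as np


def weighted_loss_terms(weights, losses):
    """Hypothetical helper: return (E[w*l], E[w]*E[l], Cov(w, l)).

    The covariance is the population covariance (normalised by N), so that
    E[w*l] = E[w]*E[l] + Cov(w, l) holds on the sample up to rounding.
    """
    w = np.asarray(weights, dtype=np.float64)
    l = np.asarray(losses, dtype=np.float64)
    mean_product = float(np.mean(w * l))
    product_of_means = float(w.mean() * l.mean())
    covariance = float(np.cov(w, l, bias=True)[0, 1])
    return mean_product, product_of_means, covariance


# Synthetic per-pixel losses (assumption: uniform on [0, 1]).
rng = np.random.default_rng(0)
losses = rng.uniform(0.0, 1.0, size=10_000)

# Weights that decrease with the loss (negatively correlated) ...
neg_terms = weighted_loss_terms(np.exp(-losses), losses)
# ... versus weights that increase with the loss (positively correlated).
pos_terms = weighted_loss_terms(np.exp(losses - 1.0), losses)

for name, (mp, pm, cov) in (("negative", neg_terms), ("positive", pos_terms)):
    print(f"{name}: E[wl]={mp:.4f}  E[w]E[l]={pm:.4f}  Cov={cov:.4f}")
```

Weights that fall as the loss rises give a negative covariance and pull the weighted loss below the product of the means, while weights that rise with the loss do the opposite.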

Figures (5)

  • Figure 1: Visualization of the Relative Dominability (RD) of each source on four tasks. The RD maps effectively highlight where each single source dominates the image fusion.
  • Figure 2: The framework of our TTD. Deriving from the generalization theory, we decompose fused images into uni-source components and find that the key to reducing the upper bound of generalization error is the negative correlation between the fusion weight and the reconstruction loss. Accordingly, we propose a pixel-wise Relative Dominability (RD) for each source, which is negatively correlated with the reconstruction loss and highlights the dominant regions of each source in constructing fusion images.
  • Figure 3: (a) On the VIF task, our TTD produces fused images that retain more multi-source information compared with existing approaches. (b) On the MIF task, our method improves the contrast of the fused image and preserves more details from the source image.
  • Figure 4: The comparison of fusion results on MEF and MFF tasks. (a) On the MFF task, our method retains the color and clarity of the original image better. (b) On the MEF task, our TTD ensures better detail preservation in varying lighting conditions.
  • Figure 5: (a) Visualization of the RDs obtained from gradient maps of different channels. The 44th gradient map provides wrong dominance information and the 13th gradient map offers insignificant information, while the 58th gradient map properly reflects the respective advantages of the two source images. (b) The radar chart of the gradient-based RD experiment (upper) and the validation of the negative correlation (lower).

Theorems & Definitions (1)

  • Theorem 3.1