ReDiffuse: Rotation Equivariant Diffusion Model for Multi-focus Image Fusion

Bo Li, Tingting Bao, Lingling Zhang, Weiping Fu, Yaxian Wang, Jun Liu

Abstract

Diffusion models have achieved impressive performance on multi-focus image fusion (MFIF). However, a key challenge in applying diffusion models to the ill-posed MFIF problem is that defocus blur can make common symmetric geometric structures (e.g., textures and edges) appear warped and deformed, often leading to unexpected artifacts in the fused images. Therefore, embedding rotation equivariance into diffusion networks is essential, as it enables the fusion results to faithfully preserve the original orientation and structural consistency of geometric patterns underlying the input images. Motivated by this, we propose ReDiffuse, a rotation-equivariant diffusion model for MFIF. Specifically, we carefully construct the basic diffusion architectures to achieve end-to-end rotation equivariance. We also provide a rigorous theoretical analysis to evaluate its intrinsic equivariance error, demonstrating the validity of embedding equivariance structures. ReDiffuse is comprehensively evaluated against various MFIF methods across four datasets (Lytro, MFFW, MFI-WHU, and Road-MF). Results demonstrate that ReDiffuse achieves competitive performance, with improvements of 0.28-6.64% across six evaluation metrics. The code is available at https://github.com/MorvanLi/ReDiffuse.

Paper Structure (21 sections, 6 theorems, 23 equations, 9 figures, 3 tables)

Key Result

Theorem 1

Assume that a feature map $F$ is discretized by the continuous function $e:\mathbb{R}^2\times S\to\mathbb{R}$, $|S|=m$, with mesh size $\delta$. If, for any $R\in S$ and $x\in\mathbb{R}^2$, the following condition is satisfied: [...] then, for any $\tilde{R}\in S$, the following result holds: [...]

Figures (9)

  • Figure 1: (a) A defocused source image in MFIF containing typical geometric structures. (b) and (c) show the fused results of FusionDiff [li2024fusiondiff] and our proposed ReDiffuse, respectively. Even under defocus, rotation equivariance enables ReDiffuse to better preserve non-local directional similarity and local isotropic symmetry, providing clear visual evidence of its benefits for the MFIF task.
  • Figure 2: Fusion results generated by three diffusion-based fusion methods and our ReDiffuse. ReDiffuse achieves superior structural consistency and artifact suppression.
  • Figure 3: Four typical conventional regularization values for near-focus images from the Lytro dataset at different rotation angles. The circled regions show that many local structures at different orientations share similar geometric properties.
  • Figure 4: The MFIF framework based on the proposed ReDiffuse. Rot-E Conv denotes the B-Conv of [xie2025rotation]. Due to efficient parameter sharing, each Rot-E convolution uses only $\tfrac{1}{m}$ of the parameters of a regular convolution, where $m$ denotes the order of the rotation group and is set to 4. As a result, ReDiffuse is lightweight, reducing the number of parameters from 26.91M to 7.55M. Moreover, under the $m=4$ rotation group, ReDiffuse theoretically achieves zero equivariance error (see Corollary 1).
  • Figure 5: Illustration of the proposed theoretical results and their correspondence with the network modules.
  • ...and 4 more figures
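
The parameter sharing and the zero-error claim in the Figure 4 caption can be illustrated with a toy example. The following is a minimal NumPy sketch, not the B-Conv of [xie2025rotation] and not the ReDiffuse implementation: it assumes a plain lifting convolution over the group of four 90-degree rotations, in which one base kernel is reused at every orientation. The helper names (`correlate2d_valid`, `c4_lifting_conv`, `equivariance_error`) are hypothetical and exist only for this illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def correlate2d_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation (no padding, stride 1)."""
    windows = sliding_window_view(image, kernel.shape)  # (H', W', kh, kw)
    return np.einsum("ijkl,kl->ij", windows, kernel)


def c4_lifting_conv(image, base_kernel):
    """Correlate the image with the four 90-degree rotations of ONE kernel.

    Returns an array of shape (4, H', W'): one orientation channel per
    rotation, all sharing the same base_kernel parameters.
    """
    return np.stack(
        [correlate2d_valid(image, np.rot90(base_kernel, r)) for r in range(4)]
    )


def equivariance_error(image, base_kernel):
    """Max absolute gap between 'rotate, then convolve' and 'convolve, then rotate'."""
    rotated_first = c4_lifting_conv(np.rot90(image), base_kernel)
    # Rotating the input rotates every output map spatially and shifts
    # the orientation channels by one step.
    rotated_after = np.roll(
        np.rot90(c4_lifting_conv(image, base_kernel), axes=(1, 2)), 1, axis=0
    )
    return np.abs(rotated_first - rotated_after).max()


rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))
base_kernel = rng.standard_normal((3, 3))

m = 4  # order of the rotation group assumed in this sketch
shared_params = base_kernel.size        # one kernel reused at m orientations
unshared_params = m * base_kernel.size  # m independent kernels of the same size
print("parameter ratio:", shared_params / unshared_params)
print("C4 equivariance error:", equivariance_error(image, base_kernel))
```

Because 90-degree rotations map the pixel grid onto itself, any error measured by this sketch comes only from floating-point round-off. Rotations by other angles do not preserve the grid and require interpolation, so exact equivariance is no longer automatic in that setting.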

Theorems & Definitions (6)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Corollary 1
  • Corollary 2