Fourier-enhanced Implicit Neural Fusion Network for Multispectral and Hyperspectral Image Fusion
Yu-Jie Liang, Zihan Cao, Liang-Jian Deng, Xiao Wu
TL;DR
This work tackles multispectral and hyperspectral image fusion (MHIF) by addressing the high-frequency information loss and limited global context of conventional implicit neural representations. It introduces FeINFN, a dual-domain framework that transforms latent codes into the Fourier domain and fuses them via a Spatial-Frequency Implicit Fusion Function (Spa-Fre IFF), complemented by a Spatial-Frequency Interactive Decoder (SFID) that employs a complex Gabor wavelet activation to promote robust cross-domain interaction. The method provides a theoretical basis for the Gabor activation's time-frequency tightness and demonstrates state-of-the-art performance on the CAVE and Harvard MHIF benchmarks, supported by comprehensive ablations of the spatial and Fourier components. The work suggests a generalizable approach for frequency-aware, implicit fusion in high-resolution image synthesis tasks, with code to be released on GitHub.
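As a rough illustration of the complex Gabor wavelet activation mentioned above, the sketch below uses the commonly cited form exp(j·omega0·x)·exp(-(s0·x)^2). This form, the function name `complex_gabor`, and the default values of `omega0` and `s0` are assumptions made for illustration; the exact parameterization used in SFID is not stated here.

```python
import cmath
import math


def complex_gabor(x: float, omega0: float = 10.0, s0: float = 10.0) -> complex:
    """Complex Gabor wavelet: a complex sinusoid under a Gaussian envelope.

    omega0 sets the oscillation frequency and s0 sets the envelope width.
    Both defaults are hypothetical, chosen only for this sketch.
    """
    oscillation = cmath.exp(1j * omega0 * x)   # frequency-localized part
    envelope = math.exp(-(s0 * x) ** 2)        # space-localized part
    return oscillation * envelope


# Evaluate the activation at a few pre-activation values.
samples = [complex_gabor(x) for x in (-0.2, 0.0, 0.2)]
```

The magnitude of the output depends only on the Gaussian envelope, so the response is concentrated near zero input, while the complex exponential controls which frequencies the activation passes.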
Abstract
Recently, implicit neural representations (INRs) have made significant strides in various vision-related domains, providing a novel solution for Multispectral and Hyperspectral Image Fusion (MHIF) tasks. However, INRs are prone to losing high-frequency information and lack global perceptual capabilities. To address these issues, this paper introduces a Fourier-enhanced Implicit Neural Fusion Network (FeINFN) specifically designed for the MHIF task, motivated by the following observation: the Fourier amplitudes of the HR-HSI latent code and the LR-HSI are remarkably similar, whereas their phases exhibit different patterns. In FeINFN, we propose a spatial and frequency implicit fusion function (Spa-Fre IFF), which helps the INR capture high-frequency information and expands its receptive field. In addition, we design a new decoder, called the Spatial-Frequency Interactive Decoder (SFID), which employs a complex Gabor wavelet activation function to enhance the interaction of INR features. We further prove theoretically that the Gabor wavelet activation possesses a time-frequency tightness property that favors learning the optimal bandwidths in the decoder. Experiments on two benchmark MHIF datasets verify the state-of-the-art (SOTA) performance of the proposed method, both visually and quantitatively, and ablation studies support each of the contributions above. The code will be available on Anonymous GitHub (https://anonymous.4open.science/r/FeINFN-15C9/) after possible acceptance.
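To make the amplitude and phase terminology concrete, the following toy sketch splits a 1D signal's discrete Fourier transform into amplitude and phase and recombines them. It is not the FeINFN pipeline: the naive `dft`/`idft` helpers, the sample values, and the circular shift are all assumptions chosen for illustration, and a real implementation would use an FFT library on 2D feature maps.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform (illustrative only)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def amplitude_phase(spectrum):
    """Split a complex spectrum into per-frequency amplitude and phase."""
    amplitudes = [abs(c) for c in spectrum]
    phases = [cmath.phase(c) for c in spectrum]
    return amplitudes, phases


def recombine(amplitudes, phases):
    """Rebuild a complex spectrum from amplitude and phase."""
    return [cmath.rect(a, p) for a, p in zip(amplitudes, phases)]


# Hypothetical toy signal and a circularly shifted copy of it.
signal = [0.0, 1.0, 3.0, 2.0, 0.5, 0.0, 0.0, 0.0]
shifted = signal[-2:] + signal[:-2]

amp_a, pha_a = amplitude_phase(dft(signal))
amp_b, pha_b = amplitude_phase(dft(shifted))

# Amplitude and phase together carry the full signal.
restored = [c.real for c in idft(recombine(amp_a, pha_a))]
```

In this toy case the two signals have identical amplitudes and different phases, which shows only that the two components can be inspected and processed separately; it is not evidence for the similarity between HR-HSI latent codes and LR-HSI reported in the abstract.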
