FUSION: Frequency-guided Underwater Spatial Image recOnstructioN
Jaskaran Singh Walia, Shravan Venkatraman, Pavithra LK
TL;DR
Underwater images suffer from color casts and low visibility caused by wavelength-dependent attenuation and scattering. The authors propose FUSION, a dual-domain framework that processes each RGB channel with multi-scale spatial convolutions and FFT-based frequency attention, then applies a Frequency Guided Fusion module and inter-channel calibration to restore color balance. They report state-of-the-art reconstruction fidelity and perceptual quality on UIEB, EUVP, and SUIM-E with a compact 0.28M-parameter footprint and modest GFLOPs, which makes the model suitable for real-time deployment on autonomous underwater platforms. By explicitly integrating spatial and spectral information and preserving phase in the frequency path, the method targets the nonuniform attenuation and long-range dependencies that challenge traditional underwater image enhancement (UIE) approaches. The wavelength-dependent degradation considered by the model is illustrated by $I(\lambda) = I_0(\lambda) e^{-\beta(\lambda) d}$, where $\lambda$ is the wavelength, $I_0(\lambda)$ the unattenuated intensity, $\beta(\lambda)$ the attenuation coefficient, and $d$ the distance traveled through water.
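To make the wavelength-dependent attenuation concrete, the following is a minimal sketch of exponential (Beer-Lambert style) attenuation applied per color channel. The coefficient values in `BETA` and the helper names `attenuate` and `attenuate_rgb` are illustrative assumptions, not values or code from the paper; they are chosen only so that red decays faster than green and blue, which is what produces the familiar blue-green cast.

```python
import math

# Hypothetical attenuation coefficients in 1/m (assumption, not from the paper).
# Red is given the largest value so that it fades fastest with distance.
BETA = {"red": 0.40, "green": 0.07, "blue": 0.05}


def attenuate(i0, beta, distance_m):
    """Exponential attenuation: I = I0 * exp(-beta * d)."""
    return i0 * math.exp(-beta * distance_m)


def attenuate_rgb(rgb, distance_m, beta=BETA):
    """Apply the attenuation law to each channel with its own coefficient."""
    return {name: attenuate(value, beta[name], distance_m)
            for name, value in rgb.items()}


# A white patch viewed through increasing amounts of water.
white = {"red": 1.0, "green": 1.0, "blue": 1.0}
for d in (1.0, 5.0, 10.0):
    seen = attenuate_rgb(white, d)
    print(d, {name: round(value, 3) for name, value in seen.items()})
```

Because each channel decays at its own rate, the ratio between channels changes with distance, so a single global gain cannot undo the cast. This is one motivation for treating the RGB channels separately before recalibrating them jointly.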
Abstract
Underwater images suffer from severe degradations, including color distortions, reduced visibility, and loss of structural details due to wavelength-dependent attenuation and scattering. Existing enhancement methods primarily focus on spatial-domain processing, neglecting the frequency domain's potential to capture global color distributions and long-range dependencies. To address these limitations, we propose FUSION, a dual-domain deep learning framework that jointly leverages spatial and frequency domain information. FUSION independently processes each RGB channel through multi-scale convolutional kernels and adaptive attention mechanisms in the spatial domain, while simultaneously extracting global structural information via FFT-based frequency attention. A Frequency Guided Fusion module integrates complementary features from both domains, followed by inter-channel fusion and adaptive channel recalibration to ensure balanced color distributions. Extensive experiments on benchmark datasets (UIEB, EUVP, SUIM-E) demonstrate that FUSION achieves state-of-the-art performance, consistently outperforming existing methods in reconstruction fidelity (highest PSNR of 23.717 dB and SSIM of 0.883 on UIEB), perceptual quality (lowest LPIPS of 0.112 on UIEB), and visual enhancement metrics (best UIQM of 3.414 on UIEB), while requiring significantly fewer parameters (0.28M) and lower computational complexity, demonstrating its suitability for real-time underwater imaging applications.
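The idea of FFT-based frequency attention can be sketched in a few lines of NumPy. This is not the authors' module: in FUSION the frequency weights would be produced by learned layers, whereas here `radial_gate` is a hypothetical fixed gate used only for illustration. The sketch assumes one plausible reading of the approach, in which the magnitude spectrum of a single channel is reweighted while its phase is left unchanged, so that structural layout is preserved.

```python
import numpy as np


def frequency_gate(channel, gate):
    """Reweight the FFT magnitude of one channel and keep its phase unchanged."""
    spectrum = np.fft.fft2(channel)
    magnitude = np.abs(spectrum)
    phase = np.angle(spectrum)
    gated = (magnitude * gate) * np.exp(1j * phase)
    # For a real input and a symmetric real gate the result is real up to rounding.
    return np.fft.ifft2(gated).real


def radial_gate(shape, boost=0.5):
    """Hypothetical fixed gate (assumption): 1 at DC, rising to 1 + boost
    at the highest spatial frequency. A learned model would predict this."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    return 1.0 + boost * radius / radius.max()


rng = np.random.default_rng(0)
channel = rng.random((8, 8))      # stand-in for one color channel
gate = radial_gate(channel.shape)
enhanced = frequency_gate(channel, gate)
print(enhanced.shape)
```

Since every frequency coefficient touches every pixel, a single multiplicative gate in this domain acts globally on the image, which is the sense in which frequency processing captures long-range dependencies that small spatial kernels do not.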
