FUSION: Frequency-guided Underwater Spatial Image recOnstructioN

Jaskaran Singh Walia, Shravan Venkatraman, Pavithra LK

TL;DR

Underwater images suffer from color casts and low visibility due to wavelength-dependent attenuation and scattering. The authors propose FUSION, a dual-domain framework that processes the RGB channels with multi-scale spatial convolutions and FFT-based frequency attention, followed by a Frequency Guided Fusion module and inter-channel calibration to restore color balance. They report state-of-the-art reconstruction fidelity and perceptual quality on UIEB, EUVP, and SUIM-E with a compact 0.28M-parameter footprint and modest GFLOPs, which makes the model a candidate for real-time deployment on autonomous underwater platforms. By explicitly integrating spatial and spectral information and preserving phase in the frequency path, the method addresses the nonuniform attenuation and long-range dependencies that challenge traditional UIE approaches. The attenuation model $I(\lambda) = I_0(\lambda)\, e^{-\beta(\lambda) d}$, where $\lambda$ is the wavelength, $\beta(\lambda)$ the attenuation coefficient, and $d$ the distance travelled through water, illustrates the underlying wavelength-dependent degradation considered by the model.
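To make the wavelength-dependent degradation concrete, the following minimal sketch applies an exponential attenuation law of the form I = I0 * exp(-beta * d) to each RGB channel separately. The coefficient values in `BETA` and the helper names `attenuate` and `attenuate_rgb` are assumptions chosen for illustration only; they are not taken from the paper. The only physical fact relied on is that red light is absorbed more strongly in water than green or blue.

```python
import math

# Illustrative attenuation coefficients per metre (assumed values, not from the paper).
# Red is absorbed fastest in water, blue slowest.
BETA = {"R": 0.60, "G": 0.12, "B": 0.08}


def attenuate(i0, beta, d):
    """Exponential attenuation: I = I0 * exp(-beta * d)."""
    return i0 * math.exp(-beta * d)


def attenuate_rgb(pixel, distance_m, beta=BETA):
    """Attenuate each channel of an RGB pixel (values in [0, 1]) independently."""
    return {c: attenuate(pixel[c], beta[c], distance_m) for c in ("R", "G", "B")}


white = {"R": 1.0, "G": 1.0, "B": 1.0}
for d in (0.0, 2.0, 5.0, 10.0):
    out = attenuate_rgb(white, d)
    print(f"d={d:4.1f} m  R={out['R']:.3f}  G={out['G']:.3f}  B={out['B']:.3f}")
```

As the distance grows, the red channel collapses long before green and blue do, which is the blue-green color cast an enhancement model has to undo on a per-channel basis.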

Abstract

Underwater images suffer from severe degradations, including color distortions, reduced visibility, and loss of structural details due to wavelength-dependent attenuation and scattering. Existing enhancement methods primarily focus on spatial-domain processing, neglecting the frequency domain's potential to capture global color distributions and long-range dependencies. To address these limitations, we propose FUSION, a dual-domain deep learning framework that jointly leverages spatial and frequency domain information. FUSION independently processes each RGB channel through multi-scale convolutional kernels and adaptive attention mechanisms in the spatial domain, while simultaneously extracting global structural information via FFT-based frequency attention. A Frequency Guided Fusion module integrates complementary features from both domains, followed by inter-channel fusion and adaptive channel recalibration to ensure balanced color distributions. Extensive experiments on benchmark datasets (UIEB, EUVP, SUIM-E) demonstrate that FUSION achieves state-of-the-art performance, consistently outperforming existing methods in reconstruction fidelity (highest PSNR of 23.717 dB and SSIM of 0.883 on UIEB), perceptual quality (lowest LPIPS of 0.112 on UIEB), and visual enhancement metrics (best UIQM of 3.414 on UIEB), while requiring significantly fewer parameters (0.28M) and lower computational complexity, demonstrating its suitability for real-time underwater imaging applications.
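The idea of frequency attention that leaves phase untouched can be sketched in a few lines of NumPy. This is a simplified illustration, not the authors' module: in FUSION the attention weights would be learned, whereas here `radial_weights` is a hypothetical hand-written mask, and the function names and the `cutoff` and `floor` parameters are assumptions introduced for this example. The sketch transforms one channel with a 2-D FFT, rescales only the amplitude, and reconstructs the channel with the original phase.

```python
import numpy as np


def frequency_attention(channel, weights):
    """Rescale the FFT amplitude of one channel while keeping its phase.

    channel: 2-D real array (one color channel).
    weights: 2-D array of positive gains, same shape, in FFT frequency layout.
    """
    spectrum = np.fft.fft2(channel)
    amplitude = np.abs(spectrum)
    phase = np.angle(spectrum)
    # Only the amplitude is modulated; the phase term is reused unchanged.
    reweighted = (amplitude * weights) * np.exp(1j * phase)
    # Symmetric weights keep the spectrum Hermitian, so the imaginary part
    # of the inverse transform is rounding noise and can be dropped.
    return np.fft.ifft2(reweighted).real


def radial_weights(shape, cutoff=0.25, floor=0.2):
    """Hand-set stand-in for learned attention: keep low frequencies at gain 1,
    damp frequencies beyond `cutoff` (cycles per pixel) to `floor`."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    return np.where(radius <= cutoff, 1.0, floor)


rng = np.random.default_rng(0)
x = rng.random((8, 8))
w = radial_weights(x.shape)
y = frequency_attention(x, w)
print(y.shape)
```

Because every frequency bin carries a global component of the image, a single multiplicative gain per bin affects all pixels at once, which is how a frequency branch can capture the long-range dependencies that small spatial kernels miss.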

Paper Structure

This paper contains 15 sections, 40 equations, 17 figures, 8 tables.

Figures (17)

  • Figure 1: Overview of our proposed FUSION architecture for UIE. The model takes a degraded underwater image as input and restores it with enhanced visual quality.
  • Figure 2: Architecture of the CBAM block [cbam]
  • Figure 3: Visual comparisons on the UIEB dataset.
  • Figure 4: Visual comparisons on the EUVP dataset.
  • Figure 5: Bubble chart comparing the trade-off between average PSNR and GFLOPs for various UIE models
  • ...and 12 more figures