A Generative Data Framework with Authentic Supervision for Underwater Image Restoration and Enhancement

Yufeng Tian, Yifan Chen, Zhe Sun, Libang Chen, Mingyu Dou, Jijun Lu, Ye Zheng, Xuelong Li

TL;DR

This work tackles the data bottleneck in underwater image restoration by introducing a generative data framework that converts high-fidelity in-air images into six underwater degradation styles, creating large-scale, pixel-aligned paired datasets. It uses unpaired image-to-image translation via CycleGAN-Turbo with a diffusion prior to generate two synthetic datasets, UWNature and UWImgNetSD, which supply ground-truth references for supervised learning of color restoration and detail recovery. Across six network architectures and three independent test sets, models trained on the synthetic data achieve comparable or superior color restoration and cross-domain generalization relative to those trained on existing benchmarks. The approach offers a scalable, automated alternative to real underwater data collection, with public release of the synthetic datasets and potential extensions to vision-language integration and downstream tasks.

Abstract

Underwater image restoration and enhancement are crucial for correcting color distortion and restoring image details, thereby establishing a fundamental basis for subsequent underwater visual tasks. However, current deep learning methodologies in this area are frequently constrained by the scarcity of high-quality paired datasets. Since it is difficult to obtain pristine reference labels in underwater scenes, existing benchmarks often rely on manually selected results from enhancement algorithms, providing debatable reference images that lack globally consistent color and authentic supervision. This limits the model's capabilities in color restoration, image enhancement, and generalization. To overcome this limitation, we propose using in-air natural images as unambiguous reference targets and translating them into underwater-degraded versions, thereby constructing synthetic datasets that provide authentic supervision signals for model learning. Specifically, we establish a generative data framework based on unpaired image-to-image translation, producing a large-scale dataset that covers 6 representative underwater degradation types. The framework constructs synthetic datasets with precise ground-truth labels, which facilitate the learning of an accurate mapping from degraded underwater images to their pristine scene appearances. Extensive quantitative and qualitative experiments across 6 representative network architectures and 3 independent test sets show that models trained on our synthetic data achieve comparable or superior color restoration and generalization performance to those trained on existing benchmarks. This research provides a reliable and scalable data-driven solution for underwater image restoration and enhancement. The generated dataset is publicly available at: https://github.com/yftian2025/SynUIEDatasets.git.
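
The pairing idea described in the abstract can be illustrated with a minimal sketch: each clean in-air image serves as the target, and a translation function produces the degraded input. The function `toy_translate`, the `TOY_GAINS` values, and the choice to apply every style to every reference are assumptions made for illustration; in the paper the translation is performed by a learned image-to-image model, not by fixed channel gains.

```python
import numpy as np

# The six degradation types named in the paper.
STYLES = ["blue", "low_light", "deep_blue", "deep_green", "green", "blurry"]

# Hypothetical RGB gains, chosen only so the toy output looks tinted.
# They are NOT taken from the paper.
TOY_GAINS = {
    "blue": (0.4, 0.8, 1.0),
    "low_light": (0.3, 0.3, 0.3),
    "deep_blue": (0.2, 0.5, 0.9),
    "deep_green": (0.2, 0.8, 0.4),
    "green": (0.5, 1.0, 0.6),
    "blurry": (0.8, 0.9, 0.9),
}


def toy_translate(x, style):
    """Stand-in for the learned translator G: tint x, and blur it for 'blurry'."""
    y = x * np.array(TOY_GAINS[style], dtype=x.dtype)
    if style == "blurry":
        # 3x3 box blur built from shifted copies (wraps around at the borders).
        y = sum(
            np.roll(y, (dy, dx), axis=(0, 1))
            for dy in (-1, 0, 1)
            for dx in (-1, 0, 1)
        ) / 9.0
    return np.clip(y, 0.0, 1.0)


def build_pairs(references, translate, styles=STYLES):
    """Return pixel-aligned (degraded input, clean target) records."""
    pairs = []
    for x in references:
        for style in styles:
            pairs.append({"input": translate(x, style), "target": x, "style": style})
    return pairs


# Two random float images in [0, 1] stand in for in-air photographs.
rng = np.random.default_rng(0)
refs = [rng.random((8, 8, 3)) for _ in range(2)]
pairs = build_pairs(refs, toy_translate)
print(len(pairs), pairs[0]["input"].shape)
```

Because the target is the untouched in-air image, every record carries an unambiguous reference, which is the property the abstract refers to as authentic supervision.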

Paper Structure

This paper contains 35 sections, 15 equations, 10 figures, 4 tables.

Figures (10)

  • Figure 1: An overview of the proposed framework for synthesizing underwater image restoration and enhancement datasets. High-quality in-air images (source domain $\mathcal{X}$) are translated into degraded underwater images (target domain $\mathcal{Y}$) via an image-to-image translation model $G$. The resulting paired data $\{(x, \hat{y})\}$ is then used to train various models in underwater image restoration and enhancement, whose performance is rigorously evaluated on independent test sets. The results consistently demonstrate superior performance in color restoration and robustness.
  • Figure 2: Processed natural image samples from the source domain, sourced from the RAISE [dang2015raise], ImageNet [deng2009imagenet], and iNaturalist [van2018inaturalist] datasets.
  • Figure 3: The 6 representative types of underwater degradation in the target domain: Blue, Low-Light, Deep Blue, Deep Green, Green, and Blurry.
  • Figure 4: The adapted SD-Turbo architecture for Nature2Underwater translation. A LoRA fine-tuned UNet performs single-pass translation guided by a text condition. Structural details are preserved by injecting VAE encoder activations via zero-convolution layers.
  • Figure 5: Sample pairs from the synthesized underwater image restoration and enhancement dataset (UWNature). Each column corresponds to one degradation type, showing the clear reference image (bottom) and its synthesized degraded counterpart (top).
  • ...and 5 more figures
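
The zero-convolution injection mentioned in the Figure 4 caption can be sketched in a few lines. A zero convolution is a convolution whose weights and bias start at zero, so at initialization the injected encoder features contribute nothing and the host network behaves exactly as before; the contribution grows only as training updates the weights. The shapes, the 1x1 kernel, the additive merge, and the names `conv1x1` and `inject_skip` are assumptions for illustration; the caption states only that VAE encoder activations are injected via zero-convolution layers.

```python
import numpy as np


def conv1x1(feat, weight, bias):
    """1x1 convolution on a (C_in, H, W) map; weight has shape (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, feat) + bias[:, None, None]


def inject_skip(host_feat, encoder_feat, weight, bias):
    """Add projected encoder activations to a host feature map."""
    return host_feat + conv1x1(encoder_feat, weight, bias)


rng = np.random.default_rng(0)
enc = rng.standard_normal((4, 8, 8))   # hypothetical encoder activation
host = rng.standard_normal((6, 8, 8))  # hypothetical feature map receiving the skip

# Zero initialization: the skip path is silent, so the output equals the host map.
w = np.zeros((6, 4))
b = np.zeros(6)
out_init = inject_skip(host, enc, w, b)

# After a (simulated) weight update, structural detail from the encoder flows in.
w_updated = w + 0.1 * rng.standard_normal(w.shape)
out_updated = inject_skip(host, enc, w_updated, b)

print(np.array_equal(out_init, host), np.allclose(out_updated, host))
```

Starting from zero lets the skip path be added to a pretrained network without disturbing its initial outputs, which is the usual motivation for this design.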