Osmosis: RGBD Diffusion Prior for Underwater Image Restoration

Opher Bar Nathan, Deborah Levy, Tali Treibitz, Dan Rosenbaum

TL;DR

Underwater image restoration is highly ill-posed: attenuation and backscatter are wavelength-dependent, and clean ground-truth data is lacking. We propose Osmosis, which learns an unconditional RGBD diffusion prior from in-air outdoor RGBD data and performs posterior sampling guided by the underwater image formation model to recover both the clean scene $J$ and depth $D$ from a single underwater image, while estimating water parameters. The key contributions are (1) an RGBD diffusion prior trained on in-air data, (2) a diffusion-guided posterior sampling framework that jointly recovers color, depth, and water parameters, and (3) state-of-the-art restoration results on real and simulated underwater scenes, with code and data publicly released. The approach performs depth-aware restoration without any underwater training data.

Abstract

Underwater image restoration is a challenging task because of water effects that increase dramatically with distance. This is worsened by the lack of ground-truth data of clean scenes without water. Diffusion priors have emerged as strong image restoration priors. However, they are often trained with a dataset of the desired restored output, which is not available in our case. We also observe that using only color data is insufficient, and therefore augment the prior with a depth channel. We train an unconditional diffusion model prior on the joint space of color and depth, using standard RGBD datasets of natural outdoor scenes in air. Using this prior together with a novel guidance method based on the underwater image formation model, we generate posterior samples of clean images, removing the water effects. Even though our prior did not see any underwater images during training, our method outperforms state-of-the-art baselines for image restoration on very challenging scenes. Our code, models and data are available on the project website.
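To make the "water effects that increase dramatically with distance" concrete, here is a minimal NumPy sketch of an underwater image formation model. The abstract does not state the model's equation, so the form below is an assumption: the commonly used per-channel model $I_c = J_c e^{-\beta^D_c D} + B^\infty_c (1 - e^{-\beta^B_c D})$, that is, an attenuated direct signal plus backscatter. The function name `underwater_forward` and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def underwater_forward(J, D, beta_d, beta_b, B_inf):
    """Simulate an underwater image from a clean image and a depth map.

    ASSUMED model (not quoted from the paper), applied per color channel:
        I = J * exp(-beta_d * D) + B_inf * (1 - exp(-beta_b * D))
    J: (H, W, 3) clean image; D: (H, W) depth;
    beta_d, beta_b, B_inf: (3,) per-channel water parameters.
    """
    D = D[..., None]                                    # broadcast depth over channels
    direct = J * np.exp(-beta_d * D)                    # attenuated scene signal
    backscatter = B_inf * (1.0 - np.exp(-beta_b * D))   # veiling light added by water
    return direct + backscatter

# Hypothetical values: red attenuates fastest, as is typical in water.
J = np.full((2, 2, 3), 0.8)
D = np.array([[0.0, 1.0], [5.0, 50.0]])
beta_d = np.array([0.6, 0.2, 0.1])
beta_b = np.array([0.5, 0.3, 0.2])
B_inf = np.array([0.05, 0.3, 0.4])

I = underwater_forward(J, D, beta_d, beta_b, B_inf)
```

Under this assumed model, a pixel at zero depth is unchanged, while a very distant pixel approaches the backscatter color `B_inf`. This is why restoration needs depth as well as color.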

Paper Structure (28 sections, 11 equations, 20 figures, 2 tables)

Figures (20)

  • Figure 1: Our method takes a single underwater image as input and outputs the restored clean image and an estimated depth map. The output is estimated using a diffusion prior trained on RGBD images and the physical image formation model.
  • Figure 2: The iterative sampling process starts at $t=T$ with random noise in $4$ channels. Each denoising step outputs denoised samples $\hat{x}_0= (\hat{J}_0, \hat{D}_0)$. We use the underwater physical image formation model together with $\hat{x}_0$ to optimize the water parameters $\hat{\phi}$ and to guide the sampling towards the observed image. This process repeats, gradually updating both the estimated image and depth, until $t=0$, at which point $x_0$ holds the method's estimate of both the reconstructed scene $J_0$ and its depth $D_0$.
  • Figure 3: [Left] Example images from outdoor RGBD datasets used for training our prior. From left to right: DIODE diode_dataset, ReDWeb-S liu2021learning, HR-WSI Xian_2020_CVPR, KITTI Geiger2013IJRR. [Right] Samples from the trained RGBD prior. The samples demonstrate the inherent correlation between RGB image and depth in our trained RGBD prior.
  • Figure 4: Our algorithm. [Left] Detailed steps of our algorithm. [Right] Example of how $\hat{J}_0, \hat{D}_0$ change during the iterations.
  • Figure 5: Real-world restoration results. From left to right: white-balanced input, contrast stretch, GDCP peng2018generalization, Ucolor li2021underwater, CWR han2022underwater, semi-UIR huang2023contrastive, DM tang2023underwater, Depth Anything yang2024depth - Osmosis, Osmosis (ours). Zoomed-in colored rectangles emphasize far objects that have higher contrast in our results. Real-world depth results. From left to right: GDCP peng2018generalization, IBLA peng2017underwater, unveiling bekerman2020unveiling, UW-Net gupta2019unsupervised, monoUWnet amitai2023self, Depth Anything yang2024depth, Osmosis (ours). Our depth results are smoother and less affected by object gradients. The reader is encouraged to zoom in.
  • ...and 15 more figures
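
The Figure 2 caption says that, at each step, the denoised estimate $\hat{x}_0 = (\hat{J}_0, \hat{D}_0)$ is used with the image formation model to optimize the water parameters $\hat{\phi}$. The sketch below illustrates only that fitting step, for one color channel, under stated assumptions. The formation model is the commonly used attenuation-plus-backscatter form, which is assumed here rather than quoted from the paper. The optimizer is plain projected gradient descent on the mean squared error, a simple stand-in and not necessarily the authors' procedure. The names `forward`, `fit_water_params`, and `mse` are hypothetical.

```python
import numpy as np

def forward(J, D, a, b, B):
    # ASSUMED single-channel formation model: attenuated signal plus backscatter.
    return J * np.exp(-a * D) + B * (1.0 - np.exp(-b * D))

def fit_water_params(I, J, D, steps=500, lr=0.05):
    """Fit phi = (a, b, B) for one channel by projected gradient descent on the MSE.

    I: observed channel; J, D: current denoised estimates (all shape (H, W)).
    Gradients are written out analytically; parameters are kept non-negative.
    """
    a, b, B = 0.1, 0.1, 0.1                       # arbitrary starting guess
    for _ in range(steps):
        ea, eb = np.exp(-a * D), np.exp(-b * D)
        r = J * ea + B * (1.0 - eb) - I           # residual against the observation
        grad_a = 2.0 * np.mean(r * (-D * J * ea))
        grad_b = 2.0 * np.mean(r * (B * D * eb))
        grad_B = 2.0 * np.mean(r * (1.0 - eb))
        a = max(a - lr * grad_a, 0.0)             # project onto a >= 0
        b = max(b - lr * grad_b, 0.0)
        B = max(B - lr * grad_B, 0.0)
    return a, b, B

def mse(I, J, D, params):
    return float(np.mean((forward(J, D, *params) - I) ** 2))

# Synthetic check: generate an observation with known (hypothetical) parameters.
rng = np.random.default_rng(0)
J_hat = rng.uniform(0.0, 1.0, size=(32, 32))
D_hat = rng.uniform(0.5, 5.0, size=(32, 32))
I_obs = forward(J_hat, D_hat, 0.3, 0.2, 0.4)

phi_init = (0.1, 0.1, 0.1)
phi_hat = fit_water_params(I_obs, J_hat, D_hat)
loss_init = mse(I_obs, J_hat, D_hat, phi_init)
loss_final = mse(I_obs, J_hat, D_hat, phi_hat)
```

In the full method described by the caption, this fit alternates with the guided denoising step, so $\hat{\phi}$ is refined as $\hat{J}_0$ and $\hat{D}_0$ improve. Here the estimates are held fixed to keep the example self-contained.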