Uncertainty Estimation for Super-Resolution using ESRGAN

Maniraj Sai Adapa, Marco Zullich, Matias Valdenegro-Toro

TL;DR

This work addresses the lack of principled predictive uncertainty in deep SR by integrating Monte Carlo Dropout and Deep Ensembles with state-of-the-art GAN-based models (SRGAN/ESRGAN) to produce per-pixel uncertainty maps alongside SR outputs. The authors demonstrate that uncertainty estimates are decently calibrated and do not degrade SR performance, with ensembles often achieving the best PSNR and reliable error–uncertainty correlations. The approach provides practical value by highlighting uncertain regions where SR may be inaccurate, potentially guiding human users and downstream systems. This enhances the interpretability and responsible deployment of SR in real-world applications where out-of-distribution inputs or fine textures pose challenges.

Abstract

Deep Learning-based image super-resolution (SR) has been gaining traction with the aid of Generative Adversarial Networks. Models like SRGAN and ESRGAN are consistently ranked among the best image SR tools. However, they lack principled ways of estimating predictive uncertainty. In the present work, we enhance these models using Monte Carlo Dropout and Deep Ensembles, allowing the computation of predictive uncertainty. When coupled with a prediction, uncertainty estimates give model users more information by highlighting pixels where the SR output is uncertain and hence potentially inaccurate, provided the estimates are reliable. Our findings suggest that these uncertainty estimates are decently calibrated and can hence fulfill this goal, with no performance drop relative to the corresponding models without uncertainty estimation.
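Both Monte Carlo Dropout and Deep Ensembles yield several SR outputs for the same low-resolution input, either from repeated stochastic forward passes or from independently trained models. The sketch below shows one common way to turn such a stack of outputs into a prediction and a per-pixel uncertainty map, assuming the prediction is the sample mean and the uncertainty is the sample standard deviation. The function names, the toy upscaler, the dropout rate, and the number of samples are all hypothetical placeholders and are not taken from the paper; a real setup would use an SRGAN or ESRGAN generator in place of `toy_stochastic_upscaler`.

```python
import numpy as np

def predictive_mean_and_std(samples):
    """Aggregate T stochastic SR outputs of shape (T, H, W, C).

    Returns the per-pixel mean (used as the SR prediction) and the
    per-pixel standard deviation (used as the uncertainty estimate).
    """
    samples = np.asarray(samples, dtype=np.float64)
    mean = samples.mean(axis=0)
    std = samples.std(axis=0)
    return mean, std

def toy_stochastic_upscaler(lr, rng, drop_p=0.1, scale=4):
    """Hypothetical stand-in for an SR generator with dropout kept active
    at inference time. It is NOT an SRGAN/ESRGAN implementation."""
    # Nearest-neighbour upsampling along height and width.
    up = np.repeat(np.repeat(lr, scale, axis=0), scale, axis=1)
    # Inverted dropout: zero random entries, rescale the rest.
    keep = rng.random(up.shape) >= drop_p
    return up * keep / (1.0 - drop_p)

rng = np.random.default_rng(0)
lr = rng.random((8, 8, 3))  # assumed 8x8 RGB low-resolution input

# T = 10 stochastic forward passes (an arbitrary choice for illustration).
samples = np.stack([toy_stochastic_upscaler(lr, rng) for _ in range(10)])
sr, uncertainty = predictive_mean_and_std(samples)

# Collapse the channel axis to get a single-channel uncertainty map.
uncertainty_map = uncertainty.mean(axis=-1)
```

For a Deep Ensemble, the same aggregation would apply with `samples` built from the outputs of the individual ensemble members instead of repeated dropout passes.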

Paper Structure

This paper contains 23 sections, 7 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: Ensemble Results using ESRGAN with Uncertainty, including error vs standard deviation plots. This figure shows how SR uncertainty correlates with SR reconstruction errors and can be used to detect possible errors at inference time. Error vs Std plots show that uncertainty correlates very well with absolute errors at the pixel level.
  • Figure 2: Visual comparison of SRGAN vs ESRGAN without uncertainty estimation. "HR" indicates the $256\times 256$ high-resolution crop which is used as ground truth. ESRGAN looks qualitatively much more impressive than SRGAN: the latter's output is very blurry and seems unable to reconstruct fine-grained details. Conversely, the former is perceptually much closer to the original image and displays generally fewer artifacts.
  • Figure 3: Comparison of SR and its uncertainty between Ensembles and MC-Dropout.
  • Figure 4: Visual comparison of Super-Resolution output and Uncertainty maps for baboon.png in Set14.
  • Figure 5: Visual comparison of Super-Resolution output and Uncertainty maps for COCO Tennis image.
  • ...and 1 more figure
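
The error vs. standard deviation relationship described for Figure 1 can be approximated with a short computation. The sketch below is one possible way to do it, assuming a ground-truth HR image is available: it computes the Pearson correlation between per-pixel absolute error and predicted standard deviation, and bins pixels by standard deviation to obtain the mean absolute error per bin. The function name `error_vs_std`, the bin count, and the synthetic data are illustrative assumptions and do not reproduce the paper's procedure or results.

```python
import numpy as np

def error_vs_std(sr, hr, std, n_bins=10):
    """Relate per-pixel absolute error to predicted standard deviation.

    Returns the Pearson correlation, the bin centers over std, and the
    mean absolute error of the pixels falling in each bin.
    """
    abs_err = np.abs(np.asarray(sr, float) - np.asarray(hr, float)).ravel()
    std = np.asarray(std, float).ravel()
    pearson = np.corrcoef(abs_err, std)[0, 1]

    # Equal-width bins over the observed std range.
    edges = np.linspace(std.min(), std.max(), n_bins + 1)
    idx = np.clip(np.digitize(std, edges) - 1, 0, n_bins - 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    mean_err = np.array([
        abs_err[idx == b].mean() if np.any(idx == b) else np.nan
        for b in range(n_bins)
    ])
    return pearson, centers, mean_err

# Synthetic example: the error magnitude is drawn to scale with std,
# which mimics a model whose uncertainty tracks its errors.
rng = np.random.default_rng(0)
hr = rng.random((32, 32))
std = rng.uniform(0.01, 0.2, size=hr.shape)
sr = hr + rng.normal(0.0, std)

pearson, centers, mean_err = error_vs_std(sr, hr, std)
```

Plotting `mean_err` against `centers` gives an error vs. std curve; a curve that rises with std indicates that higher predicted uncertainty coincides with larger errors.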