Image Super-Resolution with Guarantees via Conformalized Generative Models

Eduardo Adame, Daniel Csillag, Guilherme Tegoni Goedert

TL;DR

The paper tackles trustworthy uncertainty quantification in diffusion-based image super-resolution by introducing a conformal-prediction framework that yields a trust mask M_α(X) calibrated on unlabeled high-resolution data. It defines a model-indecision map σ to generate score masks and supports multiple fidelity metrics D_p, including pointwise, neighborhood-averaged, and semantic variants, all with probabilistic guarantees. The method provably controls fidelity error and PSNR, is robust to data leakage, and remains model-agnostic, applicable even to APIs, with efficient calibration. Empirical results on LIU4K with SinSR demonstrate accurate, interpretable confidence regions and favorable comparisons to prior uncertainty methods, suggesting practical utility for deploying high-resolution generative models in real-world settings.

Abstract

The increasing use of generative ML foundation models for image restoration tasks such as super-resolution calls for robust and interpretable uncertainty quantification methods. We address this need by presenting a novel approach based on conformal prediction techniques to create a 'confidence mask' capable of reliably and intuitively communicating where the generated image can be trusted. Our method is adaptable to any black-box generative model, including those locked behind an opaque API, requires only easily attainable data for calibration, and is highly customizable via the choice of a local image similarity metric. We prove strong theoretical guarantees for our method that span fidelity error control (according to our local image similarity metric), reconstruction quality, and robustness in the face of data leakage. Finally, we empirically evaluate these results and establish our method's solid performance.
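
As a rough illustration of how a threshold for such a confidence mask could be calibrated, the sketch below follows the general recipe of conformal risk control. It is not taken from the paper. The names `fidelity_loss`, `calibrate_threshold`, and `untrusted_mask` are hypothetical. The per-pixel indecision maps `sigmas` and per-pixel errors `ds` (standing in for σ and D_p) are assumed to be precomputed arrays, and the errors are assumed to lie in [0, 1].

```python
import numpy as np


def fidelity_loss(sigma, d, t):
    """Worst per-pixel error among pixels trusted at threshold t (sigma <= t)."""
    trusted = sigma <= t
    return float(d[trusted].max()) if trusted.any() else 0.0


def calibrate_threshold(sigmas, ds, alpha, grid):
    """Largest t in grid whose finite-sample-adjusted risk stays <= alpha.

    Assumption: errors are bounded by 1, so the correction term is 1 / (n + 1).
    Returns -inf (trust nothing) if no grid value qualifies.
    """
    n = len(sigmas)
    best = -np.inf
    for t in sorted(grid):
        risk = np.mean([fidelity_loss(s, d, t) for s, d in zip(sigmas, ds)])
        if (n * risk + 1.0) / (n + 1) <= alpha:
            best = t
        else:
            break  # the loss is nondecreasing in t, so larger t cannot pass
    return best


def untrusted_mask(sigma, t):
    """Boolean mask that is True where the generated pixel is NOT trusted."""
    return sigma > t
```

Scanning the grid in increasing order and stopping at the first failure relies on the loss being monotone in `t`: raising the threshold only adds trusted pixels, so the worst trusted error cannot decrease.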

Paper Structure

This paper contains 24 sections, 3 theorems, 20 equations, 7 figures, 2 tables, 2 algorithms.

Key Result

Theorem 2.1

Let $\alpha \in \mathbb{R}$ and $n \in \mathbb{N}$. Suppose we have $n+1$ i.i.d. samples $(X_i, Y_i)_{i=1}^{n+1}$ from an arbitrary probability di

Footnote (on "i.i.d."): Technically, Theorem 2.1 holds under the weaker assumption of exchangeability, with the same proof. We stick to i.i.d. for simplicity.

Figures (7)

  • Figure 1: Our method highlights meaningful uncertainty regions in generated images. This figure presents a comparison of multiple high-resolution images with their corresponding conformal masks. Our conformal masks accurately highlight regions where the predictions significantly deviate from the ground truth, capturing differences in color, texture, and lighting.
  • Figure 2: Our method provably controls the fidelity error with accuracy. Left: The figure shows the non-semantic fidelity error obtained by our method for varying fidelity levels $\alpha$, for calibration with both the semantic $D_p$ (orange) and non-semantic $D_p$ (blue). As is shown in our theoretical guarantee, the error is tightly controlled by our method, being at most $\alpha$. Center and Right: The plot displays the size of our confidence masks (i.e., how much of the image we do not trust) for varying fidelity levels $\alpha$, for calibration done with a semantic $D_p$ (Center) and a non-semantic $D_p$ (Right). As $\alpha$ increases our masks get smaller, eventually reaching zero, i.e., trusting the whole image.
  • Figure 3: Unreliable predictions are accurately detected. This figure shows a close-up example where $D_p$ is non-semantic (as shown in Figure \ref{fig:lr-sr-masks}), highlighting a failure of the base model to reconstruct a blurred area. Our method correctly identifies this failure and assigns low confidence to the affected region, where the predicted image deviates from the ground truth.
  • Figure 4: Our method controls the PSNR and is robust under data leakage. Left: This experiment confirms the theoretical PSNR bound established in Proposition \ref{thm:psnr}. Additionally, the PSNR within the conformal masks, in both semantic and non-semantic settings, remains consistently higher than the baseline, indicating improved prediction quality in trusted regions. Right: This plot illustrates that the fidelity error can be bounded under data leakage, as per Proposition \ref{thm:data-leakage}. In both plots, the values plateau once the method reaches the point of trusting the whole image.
  • Figure 5: Qualitative comparison to prior work. This figure compares our method (with both semantic and non-semantic $D_p$, under the same settings as Figure \ref{fig:lr-sr-masks}) against the methods of prev-interval-2 and prev-mask-noguarantee. While our conformal masks highlight precise regions of uncertainty in an interpretable way, the method from prev-mask-noguarantee produces continuous masks that closely mirror the original image rather than doing proper uncertainty estimation. Similarly, the heatmaps from prev-interval-2 do not visually convey uncertainty, making their interpretation more challenging.
  • ...and 2 more figures

Theorems & Definitions (7)

  • Theorem 2.1: Marginal conformal fidelity guarantee
  • Proposition 3.1
  • Proposition 3.2
  • Proof of Theorem \ref{thm:conformal-guarantee}
  • Proof of Proposition \ref{thm:psnr}
  • Proof of Proposition \ref{thm:data-leakage}
  • Example B.1