CODE: Confident Ordinary Differential Editing

Bastien van Delft, Tommaso Martorella, Alexandre Alahi

TL;DR

CODE addresses the challenge of conditioning diffusion models on Out-of-Distribution (OoD) guidance images. It uses a pre-trained diffusion model as a generative prior: the input is encoded along the probability-flow Ordinary Differential Equation (ODE), corrected with Langevin dynamics in latent space, and combined with a confidence-based clipping mechanism. The resulting restoration workflow is fully blind: it requires no training on corrupted data and no paired guidance, and it is compatible with any pre-trained diffusion model. The main contributions are an ODE-based editing paradigm that decouples noise injection from inversion depth and a confidence-based clipping strategy that increases robustness to unknown degradations. Empirical results show that CODE outperforms SDEdit in realism and fidelity across diverse severe degradations, highlighting its potential for flexible, unsupervised image restoration and editing in real-world OoD scenarios.

Abstract

Conditioning image generation facilitates seamless editing and the creation of photorealistic images. However, conditioning on noisy or Out-of-Distribution (OoD) images poses significant challenges, particularly in balancing fidelity to the input and realism of the output. We introduce Confident Ordinary Differential Editing (CODE), a novel approach for image synthesis that effectively handles OoD guidance images. Utilizing a diffusion model as a generative prior, CODE enhances images through score-based updates along the probability-flow Ordinary Differential Equation (ODE) trajectory. This method requires no task-specific training, no handcrafted modules, and no assumptions regarding the corruptions affecting the conditioning image. Our method is compatible with any diffusion model. Positioned at the intersection of conditional image generation and blind image restoration, CODE operates in a fully blind manner, relying solely on a pre-trained generative model. Our method introduces an alternative approach to blind restoration: instead of targeting a specific ground truth image based on assumptions about the underlying corruption, CODE aims to increase the likelihood of the input image while maintaining fidelity. This results in the most probable in-distribution image around the input. Our contributions are twofold. First, CODE introduces a novel editing method based on ODE, providing enhanced control, realism, and fidelity compared to its SDE-based counterpart. Second, we introduce a confidence interval-based clipping method, which improves CODE's effectiveness by allowing it to disregard certain pixels or information, thus enhancing the restoration process in a blind manner. Experimental results demonstrate CODE's effectiveness over existing methods, particularly in scenarios involving severe degradation or OoD inputs.

Paper Structure (88 sections, 1 theorem, 16 equations, 59 figures, 51 tables, 4 algorithms)

Key Result

Proposition 1

Let $\Phi$ be the cumulative distribution function of $\mathcal{N}(0, \mathcal{I})$ and let $x_0 \in [-1, 1]$. For $\alpha_t \in [0,1]$, $\forall t \in [0,1]$, assume that $x_t \sim \mathcal{N}(\sqrt{\alpha_t} \cdot x_0, \sqrt{1-\alpha_t} \cdot \mathcal{I})$. Then, for all $\eta$: Specifically, for $\eta = 2$:
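
The displayed bounds are not reproduced on this page. As a hedged sketch, and not necessarily the paper's exact statement, one bound that follows from the stated assumptions can be derived per coordinate. The derivation assumes that the mean is $\sqrt{\alpha_t} \cdot x_0$, that $\sqrt{1-\alpha_t}$ is the per-coordinate standard deviation, that $\eta \ge 0$, and that $\Phi$ denotes the scalar standard normal CDF. Write $x_t = \sqrt{\alpha_t}\, x_0 + \sqrt{1-\alpha_t}\, \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, 1)$. Since $|x_0| \le 1$, the event $|\varepsilon| \le \eta$ implies $|x_t| \le \sqrt{\alpha_t} + \eta\sqrt{1-\alpha_t}$, so

$$\mathbb{P}\left(|x_t| \le \sqrt{\alpha_t} + \eta\sqrt{1-\alpha_t}\right) \;\ge\; \mathbb{P}\left(|\varepsilon| \le \eta\right) \;=\; \Phi(\eta) - \Phi(-\eta) \;=\; 2\Phi(\eta) - 1.$$

For $\eta = 2$ this gives

$$\mathbb{P}\left(|x_t| \le \sqrt{\alpha_t} + 2\sqrt{1-\alpha_t}\right) \;\ge\; 2\Phi(2) - 1 \;\approx\; 0.954.$$

Under these assumptions, a latent value outside the interval $[-\sqrt{\alpha_t} - \eta\sqrt{1-\alpha_t},\ \sqrt{\alpha_t} + \eta\sqrt{1-\alpha_t}]$ is unlikely to come from an in-range image, which is one way to motivate clipping to that interval.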

Figures (59)

  • Figure 1: CODE: a conditional image generation framework for robust Out-of-Distribution image guidance.
  • Figure 2: Editing corrupted images with ODE. The green contour plot represents the distribution of images. Given a corrupted image, we encode it into a latent space using the probability-flow ODE and our Confidence-Based Clipping. We use Langevin Dynamics in the latent space to correct the encoded image. Finally, we project the updated latent back into the visual domain.
  • Figure 3: Visual comparison on CelebAHQ with various corruption types. CODE is the only method that performs across all corruption types, significantly improving over SDEdit on two complex corruptions, Fog and Contrast. Other baselines demonstrate lower versatility while requiring extra training.
  • Figure 4: Visual comparison of general image restoration under various corruptions on LSUN.
  • Figure 5: Comparison of realism-fidelity trade-off between SDEdit and CODE. Polynomial regression curves with shaded areas show one standard deviation. CODE produces more realistic images at the same fidelity. Both methods converge when the input distance is large, as they use the same pre-trained model.
  • ...and 54 more figures
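
The three stages named in the Figure 2 caption (encode with the probability-flow ODE plus clipping, correct with Langevin dynamics in latent space, decode) can be illustrated on a toy problem. The sketch below is not the authors' implementation. It assumes a zero-mean Gaussian "image" prior with an analytic score in place of a trained network, uses a deterministic DDIM-style update as the ODE discretization, and clips to the interval $\pm(\sqrt{\alpha_t} + \eta\sqrt{1-\alpha_t})$ after each encoding step. All function names, the schedule, and the hyperparameters are hypothetical.

```python
import numpy as np

S0 = 0.5  # assumption: std of the toy clean-data prior N(0, S0^2)

def marginal_var(a):
    """Variance of the noised toy prior at signal level a (= alpha_t)."""
    return a * S0**2 + (1.0 - a)

def score(x, a):
    """Analytic score of the noised toy prior; stands in for a trained model."""
    return -x / marginal_var(a)

def ddim_step(x, a_cur, a_next):
    """One deterministic step between signal levels (encode or decode)."""
    eps = -np.sqrt(1.0 - a_cur) * score(x, a_cur)        # implied noise estimate
    x0_hat = (x - np.sqrt(1.0 - a_cur) * eps) / np.sqrt(a_cur)
    return np.sqrt(a_next) * x0_hat + np.sqrt(1.0 - a_next) * eps

def confidence_clip(x, a, eta=2.0):
    """Clip to the interval an in-range image would occupy with high probability."""
    bound = np.sqrt(a) + eta * np.sqrt(1.0 - a)
    return np.clip(x, -bound, bound)

def edit(x, alphas, n_langevin=10, step=0.05, eta=2.0, rng=None):
    """Toy encode -> Langevin -> decode pipeline; alphas must be decreasing."""
    rng = np.random.default_rng(0) if rng is None else rng
    # 1) Encode: treat the input as living at the first (nearly clean) level.
    for a_cur, a_next in zip(alphas[:-1], alphas[1:]):
        x = confidence_clip(ddim_step(x, a_cur, a_next), a_next, eta)
    # 2) Correct: unadjusted Langevin dynamics at the latent level.
    a_lat = alphas[-1]
    for _ in range(n_langevin):
        noise = rng.standard_normal(x.shape)
        x = x + step * score(x, a_lat) + np.sqrt(2.0 * step) * noise
    # 3) Decode: run the deterministic steps back toward the clean level.
    rev = alphas[::-1]
    for a_cur, a_next in zip(rev[:-1], rev[1:]):
        x = ddim_step(x, a_cur, a_next)
    return x

def toy_log_likelihood(x):
    """Mean log-density under the toy prior, up to an additive constant."""
    return float(np.mean(-0.5 * x**2 / S0**2))

corrupted = np.full(64, 3.0)              # out-of-range input
alphas = np.linspace(0.999, 0.5, 50)      # hypothetical schedule
restored = edit(corrupted, alphas)
print(toy_log_likelihood(corrupted), toy_log_likelihood(restored))
```

In this toy setting the number of Langevin steps and the depth of the encoding (the last entry of `alphas`) are separate knobs, which loosely mirrors the decoupling of noise injection from inversion depth mentioned in the TL;DR.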

Theorems & Definitions (2)

  • Proposition 1
  • proof