DreamSampler: Unifying Diffusion Sampling and Score Distillation for Image Manipulation

Jeongsol Kim, Geon Yeong Park, Jong Chul Ye

TL;DR

DreamSampler unifies reverse diffusion and score distillation through regularized latent optimization, enabling model-agnostic image manipulation within latent diffusion models. By reinterpreting DDIM as a proximal update and linking the resulting objective to score-distillation losses, the framework jointly supports distillation-based editing and stochastic reverse sampling. The approach accommodates external generators and regularization terms to tackle inverse problems, image restoration, editing, vectorization, and even 3D representations, with demonstrated gains over strong baselines in SVG restoration, real image editing, and text-guided inpainting. This unified design expands the design space for diffusion-based editing and offers practical tools for diverse, text-guided image and 3D tasks, with open-source code available.

Abstract

Reverse sampling and score distillation have emerged as the main workhorses in recent years for image manipulation using latent diffusion models (LDMs). While reverse diffusion sampling often requires adjustments to the LDM architecture or feature engineering, score distillation offers a simple yet powerful model-agnostic approach, but it is often prone to mode collapse. To address these limitations and leverage the strengths of both approaches, we introduce a novel framework called DreamSampler, which seamlessly integrates these two distinct approaches through the lens of regularized latent optimization. Like score distillation, DreamSampler is a model-agnostic approach applicable to any LDM architecture, but it allows both distillation and reverse sampling with additional guidance for image editing and reconstruction. Through experiments involving image editing, SVG reconstruction, and more, we demonstrate the competitive performance of DreamSampler compared to existing approaches, while providing new applications. Code: https://github.com/DreamSampler/dream-sampler
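
The reading of DDIM as a proximal update, mentioned in the TL;DR, can be illustrated with a minimal sketch. It assumes the standard deterministic DDIM parameterization (denoise with Tweedie's formula, then re-noise with the same predicted noise) and uses a simple quadratic pull toward a reference latent as a stand-in regularizer. The function names, the `target` latent, and the weight `lam` are hypothetical choices for illustration and are not taken from the paper or its code.

```python
import math

def tweedie_denoise(z_t, eps, alpha_bar_t):
    """Posterior-mean estimate of the clean latent from z_t and the predicted noise."""
    s, c = math.sqrt(1.0 - alpha_bar_t), math.sqrt(alpha_bar_t)
    return [(z - s * e) / c for z, e in zip(z_t, eps)]

def regularized_denoise(z0_hat, target, lam):
    """Closed-form argmin_z ||z - z0_hat||^2 + lam * ||z - target||^2.

    The quadratic regularizer is an illustrative assumption; with lam = 0
    the minimizer is z0_hat itself, i.e. the plain DDIM denoised estimate.
    """
    return [(a + lam * b) / (1.0 + lam) for a, b in zip(z0_hat, target)]

def ddim_step(z_t, eps, alpha_bar_t, alpha_bar_prev, target=None, lam=0.0):
    """One deterministic DDIM step, written as 'solve for z0, then re-noise'."""
    z0_hat = tweedie_denoise(z_t, eps, alpha_bar_t)
    if target is not None:
        # Optional regularized solve replaces the plain denoised estimate.
        z0_hat = regularized_denoise(z0_hat, target, lam)
    c, s = math.sqrt(alpha_bar_prev), math.sqrt(1.0 - alpha_bar_prev)
    return [c * a + s * e for a, e in zip(z0_hat, eps)]
```

Writing the step this way separates the optimization over the clean latent from the re-noising, which is where an extra regularization or data-fidelity term can be attached without changing the diffusion model itself.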

Paper Structure

This paper contains 28 sections, 1 theorem, 34 equations, 18 figures, 2 tables, 4 algorithms.

Key Result

Theorem 1

Suppose $c_{src}$ in (eqn:onestep_dds) is defined as the null text, i.e. $c_{src}=c_\varnothing$, and consider the text-conditioned posterior mean: Then, the DDS update in (eqn:onestep_dds) can be obtained from the following latent optimization: Furthermore, it is equivalent to Tweedie's formula with CFG, i.e.: where $\boldsymbol{\epsilon}_\theta^\gamma({\boldsymbol z}_t, t, c_{tgt})=\boldsymbol{\epsil
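
As a rough illustration of "Tweedie's formula with CFG", the sketch below combines a null-text and a target-text noise prediction using the commonly used classifier-free guidance form, and then applies Tweedie's formula to the guided noise. The guidance form `eps_null + gamma * (eps_tgt - eps_null)` is an assumption here, since the definition in the statement above is cut off, and all function names are hypothetical. The sketch does not reproduce the theorem's latent optimization; it only shows the algebraic identity that the guided posterior mean equals the null-text posterior mean shifted along the noise difference.

```python
import math

def cfg_noise(eps_null, eps_tgt, gamma):
    """Classifier-free guidance in its common form (assumed, not from the source)."""
    return [n + gamma * (t - n) for n, t in zip(eps_null, eps_tgt)]

def tweedie_denoise(z_t, eps, alpha_bar_t):
    """Posterior-mean estimate of the clean latent given a noise prediction."""
    s, c = math.sqrt(1.0 - alpha_bar_t), math.sqrt(alpha_bar_t)
    return [(z - s * e) / c for z, e in zip(z_t, eps)]

def tweedie_with_cfg(z_t, eps_null, eps_tgt, gamma, alpha_bar_t):
    """Tweedie's formula evaluated with the guided noise estimate."""
    return tweedie_denoise(z_t, cfg_noise(eps_null, eps_tgt, gamma), alpha_bar_t)

def shifted_null_mean(z_t, eps_null, eps_tgt, gamma, alpha_bar_t):
    """Null-text posterior mean moved along -(eps_tgt - eps_null).

    Algebraically identical to tweedie_with_cfg; the step size is
    gamma * sqrt(1 - alpha_bar_t) / sqrt(alpha_bar_t).
    """
    z0_null = tweedie_denoise(z_t, eps_null, alpha_bar_t)
    k = gamma * math.sqrt(1.0 - alpha_bar_t) / math.sqrt(alpha_bar_t)
    return [z - k * (t - n) for z, n, t in zip(z0_null, eps_null, eps_tgt)]
```

The second function makes the connection to a distillation-style update visible: the guided estimate is the unconditional estimate plus a step proportional to the difference between the target-text and null-text noise predictions.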

Figures (18)

  • Figure 1: DreamSampler can be used for vectorized image restoration, editing, text-guided inpainting, etc. Code: https://github.com/DreamSampler/dream-sampler
  • Figure 2: DreamSampler vs (a) reverse diffusion and (b) score distillation.
  • Figure 3: Unified framework of DreamSampler. (a) Distillation step, where the gradient is computed from a regularized latent optimization problem. (b) Reverse sampling step, where the noise estimated by the diffusion model is added to the updated generation.
  • Figure 4: Representative results for image vectorization task with image reconstruction.
  • Figure 5: Qualitative comparison of SVG reconstruction. For baselines, we first obtain an initial reconstruction using PSLD (Rout et al., 2023), vectorize it with LIVE (Ma et al., 2022), and refine the output vector with VectorFusion (Jain et al., 2023) or PDS (Koo et al., 2024). DreamSampler outperforms this multi-step approach by simultaneously solving the inverse problem and updating SVG parameters via score distillation.
  • ...and 13 more figures

Theorems & Definitions (1)

  • Theorem 1