Posterior Distillation Sampling

Juil Koo, Chanho Park, Minhyuk Sung

TL;DR

Posterior Distillation Sampling (PDS) is a novel optimization method for parametric image editing based on diffusion models. It matches the stochastic latents of the source and the target, enabling the sampling of targets in diverse parameter spaces that align with a desired attribute while maintaining the source's identity.

Abstract

We introduce Posterior Distillation Sampling (PDS), a novel optimization method for parametric image editing based on diffusion models. Existing optimization-based methods, which leverage the powerful 2D prior of diffusion models to handle various parametric images, have mainly focused on generation. Unlike generation, editing requires a balance between conforming to the target attribute and preserving the identity of the source content. Recent 2D image editing methods have achieved this balance by leveraging the stochastic latent encoded in the generative process of diffusion models. To extend the editing capabilities of diffusion models shown in pixel space to parameter space, we reformulate the 2D image editing method into an optimization form named PDS. PDS matches the stochastic latents of the source and the target, enabling the sampling of targets in diverse parameter spaces that align with a desired attribute while maintaining the source's identity. We demonstrate that this optimization resembles running a generative process with the target attribute while aligning that process with the trajectory of the source's generative process. Extensive editing results in Neural Radiance Fields and Scalable Vector Graphics representations demonstrate that PDS is capable of sampling targets that fulfill the aforementioned balance across various parameter spaces.
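
To make "matching the stochastic latents" more concrete, the following is a minimal NumPy sketch, not the authors' implementation. This page gives no equations, so the sketch assumes the standard DDPM posterior and takes the stochastic latent to be the standardized residual between a noised sample at step t-1 and the posterior mean predicted from step t. The linear noise schedule, the function names, and the `point_mass_noise_pred` stand-in for a text-conditioned noise predictor are all hypothetical choices made for illustration.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear DDPM noise schedule (an assumed, commonly used choice)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def point_mass_noise_pred(mean):
    """Hypothetical stand-in for a text-conditioned noise predictor.

    Returns the exact noise prediction for data concentrated at `mean`,
    so that different `mean` values play the role of different prompts.
    """
    def predict(x_t, t, schedule):
        _, _, alpha_bars = schedule
        return (x_t - np.sqrt(alpha_bars[t]) * mean) / np.sqrt(1.0 - alpha_bars[t])
    return predict

def stochastic_latent(x0, eps_t, eps_prev, t, noise_pred, schedule):
    """Assumed definition: z_t = (x_{t-1} - mu(x_t)) / sigma_t, for t >= 1."""
    betas, alphas, alpha_bars = schedule
    # Forward-diffuse x0 to steps t and t-1 using the supplied noises.
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps_t
    x_prev = np.sqrt(alpha_bars[t - 1]) * x0 + np.sqrt(1.0 - alpha_bars[t - 1]) * eps_prev
    # DDPM posterior mean and standard deviation at step t.
    eps_hat = noise_pred(x_t, t, schedule)
    mu = (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
    sigma = np.sqrt(betas[t] * (1.0 - alpha_bars[t - 1]) / (1.0 - alpha_bars[t]))
    return (x_prev - mu) / sigma

def latent_matching_loss(x0_src, x0_tgt, pred_src, pred_tgt, t, rng, schedule):
    """Squared distance between source and target stochastic latents.

    The same noise draws are shared by source and target, so the loss
    reflects only differences in content and conditioning.
    """
    eps_t = rng.standard_normal(x0_src.shape)
    eps_prev = rng.standard_normal(x0_src.shape)
    z_src = stochastic_latent(x0_src, eps_t, eps_prev, t, pred_src, schedule)
    z_tgt = stochastic_latent(x0_tgt, eps_t, eps_prev, t, pred_tgt, schedule)
    return float(np.sum((z_tgt - z_src) ** 2))

# Usage: the target starts as a copy of the source but is scored under a
# different (hypothetical) conditioning, which yields a nonzero loss.
schedule = make_schedule()
rng = np.random.default_rng(0)
x0_src = np.array([1.0, 0.0])
pred_src = point_mass_noise_pred(np.array([1.0, 0.0]))
pred_tgt = point_mass_noise_pred(np.array([0.0, 1.0]))
loss = latent_matching_loss(x0_src, x0_src.copy(), pred_src, pred_tgt, 500, rng, schedule)
```

In an actual editing setup, the target would be rendered from the parameters being optimized (for example a NeRF or an SVG) and the loss would be minimized with respect to those parameters; the sketch only evaluates the loss for fixed inputs.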

Paper Structure

This paper contains 30 sections, 22 equations, 8 figures, and 2 tables.

Figures (8)

  • Figure 1: A comparison of 3D scene editing between PDS and other baselines. Given input 3D scenes on the left, PDS, marked by green boxes on the rightmost side, successfully performs complex editing, such as geometric changes and adding objects, according to the input texts. On the other hand, the baselines either fail to change the input 3D scenes or produce results that greatly deviate from the input scenes, losing their identity.
  • Figure 2: A visual comparison of the editing process through SDS [Poole:2023DreamFusion], DDS [Hertz:2023DDS] and PDS. The figure illustrates the trajectories of samples drawn from $p(\mathbf{x}_0 | y=1)$ as they are shifted towards $p(\mathbf{x}_0 | y=2)$. PDS notably moves the samples near the boundary of the two marginals, which is the optimal endpoint in that it balances the necessary change with the original identity.
  • Figure 3: An example of editing inducing large variations across different views. The figure shows NeRF editing results of our method and of the Iterative DU methods, IN2N [Haque:2023InstructNeRF] and Inv2N, alongside their corresponding 2D editing results obtained by IP2P [Brooks:2023InstructPix2Pix] and DDPM Inversion [Huberman:2023FriendlyInversion], respectively. When 2D editing leads to large variations, the Iterative DU methods fail to produce accurate edits in 3D space.
  • Figure 4: A qualitative comparison of SVG editing using three different optimization methods: SDS [Poole:2023DreamFusion], DDS [Hertz:2023DDS] and PDS. PDS makes changes according to the input text while best preserving the structural semantics of the input SVGs.
  • Figure A5: Editing of more diverse representations: 3D Gaussian Splats [Kerbl:20233DGS] and 2D images. PDS consistently outperforms the baselines. The target attributes are "Batman" and "raising the arms."
  • ...and 3 more figures