DIA: The Adversarial Exposure of Deterministic Inversion in Diffusion Models

Seunghoo Hong, Geonho Son, Juhun Lee, Simon S. Woo

TL;DR

This paper addresses the risk that DDIM inversion, which makes real-image editing possible, can be misused, and proposes the DDIM Inversion Attack (DIA) to disrupt the integrated diffusion trajectory. DIA comprises two variants, DIA-PT and DIA-R, which explicitly attack the inversion process trajectory and the reconstructed latent path using differentiable, memory-efficient trajectory optimization. On the PIE-Bench benchmark, the DIA methods outperform existing defenses and remain disruptive across editing methods, noise budgets, and purification, while preserving perceptual content to a meaningful degree. The work offers industry and research a practical defense against misuse of inversion-based editing in diffusion models, with a code release to facilitate adoption and further study.

Abstract

Diffusion models have been shown to be strong representation learners, achieving state-of-the-art performance across multiple domains. Aside from accelerated sampling, DDIM also enables the inversion of real images back to their latent codes. A direct application of this inversion operation is real image editing, where the inversion yields latent trajectories that are used during the synthesis of the edited image. Unfortunately, this practical tool has made it easier for malicious users to synthesize misinformative or deepfake content, which promotes the spread of unethical, abusive, privacy-infringing, and copyright-infringing content. While defensive algorithms such as AdvDM and Photoguard have been shown to disrupt the diffusion process on these images, the misalignment between their objectives and the iterative denoising trajectory at test time results in weak disruptive performance. In this work, we present the DDIM Inversion Attack (DIA), which attacks the integrated DDIM trajectory path. Our results show effective disruption, surpassing previous defensive methods across various editing methods. We believe that our frameworks and results can provide practical defense methods against the malicious use of AI for both industry and the research community. Our code is available here: https://anonymous.4open.science/r/DIA-13419/.
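
For readers unfamiliar with the inversion operation the abstract refers to, the following is a minimal sketch of deterministic DDIM inversion and sampling, not the authors' implementation. It uses the standard DDIM update with zero stochasticity. The names `ddim_transition`, `ddim_invert`, `ddim_sample`, and `toy_eps_model`, as well as the noise schedule, are hypothetical and chosen for illustration only; a real pipeline would use a trained noise predictor operating on Stable Diffusion latents.

```python
import numpy as np

def ddim_transition(x, eps, abar_from, abar_to):
    """Deterministic DDIM update between two noise levels for a given noise estimate."""
    # Predict the clean sample implied by the current point and the noise estimate.
    x0_pred = (x - np.sqrt(1.0 - abar_from) * eps) / np.sqrt(abar_from)
    # Re-noise the prediction to the target noise level with the same estimate.
    return np.sqrt(abar_to) * x0_pred + np.sqrt(1.0 - abar_to) * eps

def toy_eps_model(x, t):
    # Hypothetical stand-in for a trained noise predictor (assumption, not from the paper).
    return 0.1 * np.tanh(x)

def ddim_invert(x0, abar, eps_model):
    """Walk from the clean sample toward noise, recording the latent trajectory."""
    trajectory = [x0]
    x = x0
    for t in range(len(abar) - 1):
        eps = eps_model(x, t)
        x = ddim_transition(x, eps, abar[t], abar[t + 1])
        trajectory.append(x)
    return trajectory

def ddim_sample(x_T, abar, eps_model):
    """Walk from the noisy latent back toward a clean sample."""
    x = x_T
    for t in range(len(abar) - 1, 0, -1):
        eps = eps_model(x, t)
        x = ddim_transition(x, eps, abar[t], abar[t - 1])
    return x

# Illustrative schedule of cumulative alpha products (assumed values).
abar = np.linspace(0.9999, 0.02, 20)
x0 = np.random.default_rng(0).normal(size=(4, 4))
trajectory = ddim_invert(x0, abar, toy_eps_model)
reconstruction = ddim_sample(trajectory[-1], abar, toy_eps_model)
```

With a learned predictor the reconstruction is only approximate, because inversion evaluates the noise estimate at the current point while sampling evaluates it at the next one. Editing methods rely on this trajectory being faithful, which is why perturbing the input so that the trajectory drifts can disrupt the edit.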

Paper Structure

This paper contains 36 sections, 13 equations, 9 figures, 9 tables.

Figures (9)

  • Figure 1: CLIP similarity score difference between Natural Editing and disruption methods. Our methods DIA-PT/R demonstrate good semantic disruption performance across various combinations of Inversions and Edits. Lower scores equate to stronger disruption.
  • Figure 1: Quality comparison across Stable Diffusion model versions. Images immunized with DIA-PT and DIA-R on SD v1.4 are edited with SD v1.4, SD v2.0, and SD v2.1. Editing with a different version reduces the disruptive performance, but the protection remains considerably effective.
  • Figure 2: Overview of the DDIM Inversion Attack Framework (DIA). In this figure, all items are explained in the context of DDIM with timesteps skipping by 3. a) visualizes the DDIM Process summarized by the strength of $x_0$ and the model trajectory. Each component provides details for explaining the DIA. In b) and c), optimization of $\delta_{\text{DIA-PT}}$ and $\delta_{\text{DIA-R}}$ is shown, which are adversarial noises that interfere with obtaining $z_T$ and $x_0$ using the Differential Trajectory. Finally, d) illustrates the Differential Trajectory to be used in the DDIM process attack. Note that chaining computational graphs to compute the loss results in excessive memory consumption. Therefore, obtaining partial gradients is necessary through step-by-step DDIM inference using cloned latents.
  • Figure 2: Quality comparison of images generated by DDIM-to-P2P across different immunization methods. The words in green indicate the parts to be edited from the original image. We visualize the failure to preserve the integrity of the original image.
  • Figure 3: Quality comparison of images generated by DDIM-to-DDIM across different immunization methods. The words in green indicate the parts to be edited from the original image. Each method shows a different immunization performance relative to Natural Edit. Our methods, DIA-PT and DIA-R, demonstrate robust image protection performance on various images.
  • ...and 4 more figures