Disrupting Diffusion-based Inpainters with Semantic Digression

Geonho Son; Juhun Lee; Simon S. Woo

TL;DR

This work addresses the rising threat of malicious edits using diffusion-based inpainters by introducing DDD, a disruption framework that performs semantic digression to immunize context images. Rather than brute-forcing through the full diffusion chain, DDD targets vulnerable early timesteps and optimizes a timestep-agnostic hidden-space loss around a multimodal semantic centroid, refined via token-projected text optimization. The key contributions are: (1) identifying a vulnerable timestep range, (2) formulating a timestep-free loss in hidden space, (3) defining a semantic centroid via Monte Carlo sampling, and (4) enabling stable, discrete token projection for text conditioning. Empirically, DDD outperforms the prior state-of-the-art Photoguard across quantitative disruption metrics and qualitative evaluations, while lowering the GPU memory requirement and speeding up the optimization by up to three times, which makes image protection against unconsented edits feasible on practical hardware.
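The "discrete token projection" in contribution (4) can be illustrated with a minimal sketch. It assumes that projection means snapping each continuously optimized embedding to its nearest vocabulary embedding under Euclidean distance; the summary does not state the exact rule DDD uses, so the function name `project_to_tokens` and the toy vocabulary below are hypothetical.

```python
import numpy as np

def project_to_tokens(soft_emb, vocab_emb):
    """Snap each continuous embedding row to its nearest vocabulary row.

    soft_emb:  (n_tokens, dim) continuously optimized embeddings.
    vocab_emb: (vocab_size, dim) embedding table of real tokens.
    Nearest-neighbor Euclidean projection is an assumption of this sketch.
    """
    # Pairwise squared distances, shape (n_tokens, vocab_size).
    diff = soft_emb[:, None, :] - vocab_emb[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    token_ids = dist2.argmin(axis=1)
    return token_ids, vocab_emb[token_ids]

# Toy vocabulary of four one-hot "token" embeddings (hypothetical).
vocab = np.eye(4)
soft = np.array([[0.9, 0.1, 0.0, 0.0],
                 [0.0, 0.2, 0.7, 0.0]])
ids, hard = project_to_tokens(soft, vocab)
```

Keeping the text condition on real tokens means the optimized prompt always corresponds to an embedding the text encoder has actually seen, which is one plausible reading of why the projection stabilizes the optimization.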

Abstract

The fabrication of visual misinformation on the web and social media has increased exponentially with the advent of foundational text-to-image diffusion models. In particular, Stable Diffusion inpainters allow the synthesis of maliciously inpainted images, also known as deepfakes, of personal and private figures and of copyrighted content. To combat such generations, a disruption framework named Photoguard has been proposed, which adds adversarial noise to the context image to disrupt its inpainting synthesis. While Photoguard suggested a diffusion-friendly approach, its disruption is not sufficiently strong, and immunizing the context image requires a significant amount of GPU memory and time. In our work, we re-examine both the minimal and favorable conditions for a successful inpainting disruption, proposing DDD, a "Digression guided Diffusion Disruption" framework. First, we identify the most adversarially vulnerable diffusion timestep range with respect to the hidden space. Within this scope of the noised manifold, we pose the problem as a semantic digression optimization. We maximize the distance between the inpainting instance's hidden states and a semantic-aware hidden state centroid, calibrated both by Monte Carlo sampling of hidden states and by a discretely projected optimization in the token space. As a result, our approach achieves stronger disruption and a higher success rate than Photoguard while lowering the GPU memory requirement and speeding up the optimization by up to three times.
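The digression objective described above can be sketched on a toy problem. This is an illustration under stated assumptions, not the authors' implementation: the hidden-state extractor is a hypothetical random linear map followed by tanh (standing in for the inpainter's network at a fixed early timestep), the centroid is a plain average over noise draws, and the update is a signed-gradient ascent step under an L-infinity budget. All names and hyperparameters are invented for the example.

```python
import numpy as np

def hidden(x, noise, W):
    """Toy hidden state of a noised, flattened image (stand-in, not a U-Net)."""
    return np.tanh(W @ (x + noise))

def monte_carlo_centroid(x, W, rng, n_samples=32, sigma=0.5):
    """Estimate the centroid by averaging hidden states over noise draws."""
    states = [hidden(x, sigma * rng.normal(size=x.shape), W)
              for _ in range(n_samples)]
    return np.mean(states, axis=0)

def digression_loss_and_grad(x_adv, centroid, noise, W):
    """Squared distance to the centroid and its gradient w.r.t. x_adv."""
    h = hidden(x_adv, noise, W)
    diff = h - centroid
    loss = float(diff @ diff)
    # Chain rule through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
    grad = W.T @ (2.0 * diff * (1.0 - h ** 2))
    return loss, grad

def immunize(x, rng, eps=0.05, step=0.01, iters=50, sigma=0.5):
    """Push the hidden state of x away from its centroid within an eps-ball."""
    W = rng.normal(size=(16, x.size)) / 8.0   # hypothetical extractor weights
    centroid = monte_carlo_centroid(x, W, rng, sigma=sigma)
    x_adv = x.copy()
    for _ in range(iters):
        noise = sigma * rng.normal(size=x.shape)
        _, grad = digression_loss_and_grad(x_adv, centroid, noise, W)
        x_adv = x_adv + step * np.sign(grad)       # ascent: maximize distance
        x_adv = np.clip(x_adv, x - eps, x + eps)   # stay within the budget
        x_adv = np.clip(x_adv, 0.0, 1.0)           # stay a valid image
    return x_adv, centroid

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=64)
x_adv, centroid = immunize(x, rng)
```

The loss here depends only on hidden states and not on completing the denoising chain, which mirrors the motivation given in the abstract for lower memory use and faster optimization.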

Paper Structure (26 sections, 11 equations, 13 figures, 2 tables)

Introduction
Related Works
Background
Adversarial Attacks
Diffusion Models
Stable Diffusion Inpainters
Method
Search for the Vulnerable Timestep
Timestep Constraint-Free Loss Function
Aligning the Objective with Disruption's Goal
Token Projective Textual Optimization
Experimental Results
Technical Details
Quantitative Results
Qualitative Results
...and 11 more sections

Figures (13)

Figure 1: Our framework DDD optimizes adversarial perturbations that protect images from being edited by malicious users without consent. Across various test images, we demonstrate that our approach covers copyrighted images, pornographic abuse, and public figure editing scenarios. Ultimately, it digresses the representation of the context image away from its multimodal nucleus in expectation.
Figure 2: Overview of DDD's framework. The objective is to find the context image's representative multimodal centroid, from which the immunized image's representation semantically digresses. (a) illustrates the pipeline for sampling all of the hidden states used in the framework. In (b), we first use the context image to compute the diffusion-based inpainting loss, which updates the token embedding $\pi^\ast$. Finally, with $\tau^\ast = \mathcal{E}(\pi^\ast)$, we construct the multimodal centroid via Monte Carlo sampling.
Figure 3: A side-by-side comparison of Photoguard against DDD. DDD disrupts the hidden representations, which leaves more clearly visible unnatural artifacts and disruption across the images.
Figure 4: Human evaluation on 20 random disrupted examples.
Figure 5: An ablation study for various targeted and untargeted scenarios. Here, "S" and "T" denote the source and target text conditions, respectively. The down arrow ($\color{blue}\downarrow$) signifies a targeted scenario, where the objective is to move closer to the fixed $H^{target}$, while the up arrow ($\color{blue}\uparrow$) represents an untargeted scenario, where the objective is to move away from the fixed $H^{target}$. Panel (e) shows the result of our method, DDD.
...and 8 more figures
