Structure Disruption: Subverting Malicious Diffusion-Based Inpainting via Self-Attention Query Perturbation

Yuhao He, Jinyu Tian, Haiwei Wu, Jianqing Li

TL;DR

This work addresses the risk of malicious diffusion-model inpainting by proposing Structure Disruption Attack (SDA), a region-specific defense that perturbs self-attention during the initial denoising step to disrupt contour formation and prevent coherent image synthesis in protected regions. SDA capitalizes on the coarse-to-fine generation of diffusion models, delivering a computationally efficient defense that avoids full-chain optimization by focusing on the early denoising phase and using a targeted objective on self-attention queries. Empirical results across face and instance datasets show SDA achieves state-of-the-art protection relative to existing methods, with strong robustness to data augmentations, model versions, and mask variations, suggesting practical applicability for safeguarding user images against unauthorized edits. The work highlights the pivotal role of self-attention in diffusion-based generation and offers a concrete, scalable defense with broad implications for safer AI-enabled image editing.

Abstract

The rapid advancement of diffusion models has enhanced their image inpainting and editing capabilities but also introduced significant societal risks. Adversaries can exploit user images from social media to generate misleading or harmful content. While adversarial perturbations can disrupt inpainting, global perturbation-based methods fail in mask-guided editing tasks due to spatial constraints. To address these challenges, we propose Structure Disruption Attack (SDA), a powerful protection framework for safeguarding sensitive image regions against inpainting-based editing. Building upon the contour-focused nature of self-attention mechanisms of diffusion models, SDA optimizes perturbations by disrupting queries in self-attention during the initial denoising step to destroy the contour generation process. This targeted interference directly disrupts the structural generation capability of diffusion models, effectively preventing them from producing coherent images. We validate our motivation through visualization techniques and extensive experiments on public datasets, demonstrating that SDA achieves state-of-the-art (SOTA) protection performance while maintaining strong robustness.
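The core idea in the abstract, optimizing a bounded perturbation so that the self-attention queries of the first denoising step are pushed away from their clean values, can be illustrated with a deliberately simplified sketch. This is not the authors' objective or implementation: a fixed linear map `w_q` stands in for the query projection inside the diffusion U-Net, projected gradient ascent (PGD) is assumed as the optimizer, and the function name `query_disruption_pgd`, the mask convention, and the `eps`/`step`/`iters` values are all hypothetical.

```python
import numpy as np


def query_disruption_pgd(x, w_q, mask, eps=8 / 255, step=2 / 255, iters=10, seed=0):
    """Toy PGD that maximizes ||(x + delta) @ w_q - x @ w_q||^2.

    x:    (n_tokens, d) clean features in [0, 1] (stand-in for image content)
    w_q:  (d, d_q) hypothetical linear query projection
    mask: (n_tokens, 1) with 1 where the perturbation is allowed, 0 elsewhere
    """
    rng = np.random.default_rng(seed)
    # Random start: the gradient of this objective is exactly zero at delta = 0.
    delta = rng.uniform(-eps, eps, size=x.shape) * mask
    for _ in range(iters):
        diff = delta @ w_q                    # query shift for a linear projection
        grad = 2.0 * diff @ w_q.T             # gradient of ||delta @ w_q||^2 w.r.t. delta
        delta = delta + step * np.sign(grad)  # ascent step (we want a larger shift)
        delta = np.clip(delta, -eps, eps) * mask   # L_inf budget, masked region only
        delta = np.clip(x + delta, 0.0, 1.0) - x   # keep the perturbed input in [0, 1]
    return delta
```

In a real pipeline the linear map would be replaced by a forward pass through the inpainting model's first denoising step, with gradients obtained by automatic differentiation; the masking and projection steps would stay structurally the same.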

Paper Structure

This paper contains 27 sections, 2 equations, 10 figures, 3 tables.

Figures (10)

  • Figure 1: Protected vs. Unprotected Image Resistance: (Top) Malicious inpainting alters contextual elements (e.g., nudity and the White House) while preserving key features (e.g., the human face). (Bottom) SDA-protected images resist such edits through sensitive-region encryption, effectively neutralizing unauthorized editing.
  • Figure 2: Denoising process of inpainting diffusion models. We visualize intermediate outputs of the denoising process and mark the facial contours with red curves.
  • Figure 3: Attention maps during the denoising process of inpainting.
  • Figure 4: The inpainting diffusion pipeline and the protective perturbation update process. The black path is the inpainting denoising process, and the red path is the perturbation update process.
  • Figure 5: Comparison of attention maps during the generation process between original and protected images. Above the dashed line: generation process of the original image, where the first row shows self-attention maps and the second row displays cross-attention maps for the "dog" token. Below the dashed line: generation process of the image protected by SDA.
  • ...and 5 more figures