Disrupting Style Mimicry Attacks on Video Imagery

Josephine Passananti; Stanley Wu; Shawn Shan; Haitao Zheng; Ben Y. Zhao

TL;DR

This work investigates the vulnerability of video imagery to style mimicry attacks and shows that naive per-frame defenses fail against adaptive, temporally aware attacks. It introduces Gimbal, a scene-based protection framework that computes a universal target for each video scene and gradually refines per-frame perturbations toward it, maintaining robust protection while reducing computation and improving video quality. Across automated metrics and large user studies, Gimbal restores protection against adaptive attacks and often improves perceptual quality over naive defenses. The findings highlight both the practicality of video-based mimicry threats and a scalable defense strategy relevant to creators and platforms.

Abstract

Generative AI models are often used to perform mimicry attacks, where a pretrained model is fine-tuned on a small sample of images to learn to mimic a specific artist of interest. While researchers have introduced multiple anti-mimicry protection tools (Mist, Glaze, Anti-Dreambooth), recent evidence points to a growing trend of mimicry models using videos as sources of training data. This paper presents our experiences exploring techniques to disrupt style mimicry on video imagery. We first validate that mimicry attacks can succeed by training on individual frames extracted from videos. We show that while anti-mimicry tools can offer protection when applied to individual frames, this approach is vulnerable to an adaptive countermeasure that removes protection by exploiting randomness in the optimization results of consecutive (nearly identical) frames. We develop a new, tool-agnostic framework that segments videos into short scenes based on frame-level similarity and uses a per-scene optimization baseline to remove inter-frame randomization while reducing computational cost. We show via both image-level metrics and an end-to-end user study that the resulting framework restores protection against mimicry (including the countermeasure). Finally, we develop another adaptive countermeasure and find that it falls short against our framework.
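
To make the adaptive countermeasure concrete, the toy sketch below averages each frame with its temporal neighbors, which is the pixel-averaging idea named in the Figure 3 caption. It is an illustration under stated assumptions, not the authors' implementation: the function name `average_neighbors`, the `radius` parameter, and the use of independent Gaussian noise as a stand-in for per-frame protection perturbations are all hypothetical. Real protection perturbations are optimized rather than Gaussian; the toy only shows why perturbations that vary independently across near-identical frames tend to cancel under averaging.

```python
import numpy as np


def average_neighbors(frames: np.ndarray, radius: int = 2) -> np.ndarray:
    """Replace each frame with the mean of its temporal window.

    frames: array of shape (N, H, W, C) with float pixel values.
    radius: number of neighbors on each side (hypothetical parameter).
    """
    n = frames.shape[0]
    out = np.empty_like(frames, dtype=np.float64)
    for i in range(n):
        lo = max(0, i - radius)          # clamp the window at the clip start
        hi = min(n, i + radius + 1)      # clamp the window at the clip end
        out[i] = frames[lo:hi].mean(axis=0)
    return out


# Toy data: one static "clean" frame repeated 9 times, each copy with its
# own independent noise (an assumed stand-in for per-frame perturbations).
rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 1.0, size=(8, 8, 3))
frames = np.stack(
    [clean + rng.normal(0.0, 0.05, size=clean.shape) for _ in range(9)]
)

recovered = average_neighbors(frames, radius=4)

# Mean absolute deviation from the clean frame, before and after averaging.
err_before = np.abs(frames[4] - clean).mean()
err_after = np.abs(recovered[4] - clean).mean()
print(f"before: {err_before:.4f}  after: {err_after:.4f}")
```

In this toy setting the middle frame's window covers all nine copies, so its deviation from the clean frame shrinks noticeably. A per-scene target, as the abstract describes, is meant to remove exactly the inter-frame randomness that this averaging exploits.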

Paper Structure (30 sections, 2 equations, 15 figures, 12 tables)

Introduction
Background and Related Work
Style Mimicry and Existing Defenses
Style Mimicry Attacks on Extracted Video Frames
Threat Model
Style Mimicry Leveraging Video Frames
A Naive Defense and Its Limitations
An Adaptive Mimicry Attack
Perturbation Removal
Experimental Setup and Metrics
Adaptive Mimicry Results
Protecting Video Imagery with Gimbal
Design intuition
System Design
Evaluation
...and 15 more sections

Figures (15)

Figure 1: Style mimicry scenario showing the pipeline by which adversaries fine-tune a diffusion model on video frames.
Figure 2: Style mimicry on clean video frames successfully mimics the style of the original video.
Figure 3: Averaging pixel values across highly similar consecutive frames degrades the randomized protection pixel shifts across frames and largely restores the original unprotected frame.
Figure 4: Visual examples of the adaptive mimicry attack. Three rows of mimicry images generated by models trained on 1) original video frames, 2) video frames protected naively, and 3) video frames recovered after perturbation removal using pixel averaging.
Figure 5: Gimbal partitions videos into scenes by measuring average pixel difference. Each scene is perturbed using a two-part process: 1) a target image ($T$) is computed by averaging all frames in the scene and style-transferring the result to a 'target style'; 2) perturbations are iteratively applied and optimized by minimizing the latent $L_2$ norm between $T$ and the perturbed frame.
...and 10 more figures
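
The scene partitioning described in the Figure 5 caption can be sketched roughly as follows. This is a minimal illustration under assumptions, not Gimbal's actual code: the caption says only that scenes are found by measuring average pixel difference, so comparing each frame with its immediate predecessor, the cut-off `threshold`, and the helper names `split_into_scenes` and `scene_mean` are all hypothetical. The style-transfer step that turns the averaged frame into the target $T$, and the latent-space optimization, are omitted.

```python
import numpy as np


def split_into_scenes(frames: np.ndarray, threshold: float = 0.1) -> list[tuple[int, int]]:
    """Split a clip into (start, end) index ranges, end exclusive.

    A new scene starts whenever the mean absolute pixel difference between
    consecutive frames exceeds `threshold` (an assumed criterion and value).
    """
    if len(frames) == 0:
        return []
    scenes = []
    start = 0
    for i in range(1, len(frames)):
        prev = frames[i - 1].astype(np.float64)
        curr = frames[i].astype(np.float64)
        if np.abs(curr - prev).mean() > threshold:
            scenes.append((start, i))
            start = i
    scenes.append((start, len(frames)))
    return scenes


def scene_mean(frames: np.ndarray, scene: tuple[int, int]) -> np.ndarray:
    """Average all frames of one scene (the input to the omitted style transfer)."""
    start, end = scene
    return frames[start:end].astype(np.float64).mean(axis=0)


# Toy clip: three all-black frames followed by two all-white frames.
dark = np.zeros((3, 4, 4, 3))
bright = np.ones((2, 4, 4, 3))
video = np.concatenate([dark, bright])

scenes = split_into_scenes(video, threshold=0.1)
averages = [scene_mean(video, s) for s in scenes]
print(scenes)
```

On the toy clip the single hard cut between the black and white frames yields two scenes. Because every frame in a scene shares one averaged image, a target derived from it is the same for the whole scene, which is the property the caption relies on when it optimizes each perturbed frame toward $T$.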
