Video Deepfake Abuse: How Company Choices Predictably Shape Misuse Patterns

Max Kamachee, Stephen Casper, Michelle L. Ding, Rui-Jie Yew, Anka Reuel, Stella Biderman, Dylan Hadfield-Menell

TL;DR

The paper examines how open-weight AI video models and distribution platforms contribute to non-consensual and CSAM-related harms, drawing parallels with the 2022 image-generation surge. It analyzes historical breakthroughs, current video-generation ecosystems, and safeguards literature to show that a small set of models and platforms drive NSFW content, and that risk management by developers and distributors can meaningfully curb harm. It argues for data curation, post-training unlearning, evaluations, and staged deployments, while noting widespread under-reporting of mitigations by developers and uneven platform enforcement. The findings inform policy and industry practices, highlighting that proactive risk mitigation can reduce misuse without foreclosing the benefits of powerful open-weight models.

Abstract

In 2022, AI image generators crossed a key threshold, enabling much more efficient and dynamic production of photorealistic deepfake images than before. This enabled opportunities for creative and positive uses of these models. However, it also enabled unprecedented opportunities for the low-effort creation of AI-generated non-consensual intimate imagery (AIG-NCII), including AI-generated child sexual abuse material (AIG-CSAM). Empirically, these harms were principally enabled by a small number of models that were trained on web data with pornographic content, released with open weights, and insufficiently safeguarded. In this paper, we observe ways in which the same patterns are emerging with video generation models in 2025. Specifically, we analyze how a small number of open-weight AI video generation models have become the dominant tools for videorealistic AIG-NCII video generation. We then analyze the literature on model safeguards and conclude that (1) developers who openly release the weights of capable video generation models without appropriate data curation and/or post-training safeguards foreseeably contribute to mitigatable downstream harm, and (2) model distribution platforms that do not proactively moderate individual misuse or models designed for AIG-NCII foreseeably amplify this harm. While there are no perfect defenses against AIG-NCII and AIG-CSAM from open-weight AI models, we argue that risk management by model developers and distributors, informed by emerging safeguard techniques, will substantially affect the future ease of creating AIG-NCII and AIG-CSAM with generative AI video tools.

Paper Structure

This paper contains 23 sections, 3 figures, 2 tables.

Figures (3)

  • Figure 1: The supply chain for open-weight AI models capable of creating non-consensual intimate video deepfakes. Models flow from developers through distribution platforms to modifiers, who create specialized variants that power user-facing applications (e.g., undressing applications). Individual actors with technical expertise can also directly download models from distribution platforms and create AIG-NCII locally. The modification and redistribution cycle (highlighted in orange) shows how models with openly available weights can undergo multiple rounds of modification and be re-uploaded to model distribution platforms. Developers and distribution platforms serve as critical bottlenecks. Scale indicators show the rough number of actors at each stage.
  • Figure 2: Which open-weight video generation models are most disproportionately used to create NSFW content online? We analyze model search hits on subreddits (left), model search hits on CivitAI and a CivitAI model archive site (middle), and video search hits on CivitAI (right). In each analysis, we report the SFW market share, the NSFW market share, and the NSFW/SFW market share ratio. The first two columns of each grid sum to 100%. Some models, including Wan2.x, stable-video-diffusion, HunyuanVideo, and LTX-Video, are disproportionately used to generate NSFW content.
  • Figure 3: Stable Diffusion 2.x models offer an empirical example of models trained without NSFW data being used less for generating NSFW content. On CivitAI, Stable Diffusion 1.x has over 1,000x more tagged NSFW images and an NSFW/SFW model ratio over 2x higher than that of Stable Diffusion 2.x. (Left) NSFW image searches on CivitAI show that Stable Diffusion 1.x dominates with 100,000+ results for "NSFW" and tens of thousands for other terms, while Stable Diffusion 2.x returns fewer than 100 results per term. (Right) Model count analysis reveals that Stable Diffusion 1.x has 37,075 NSFW models compared to Stable Diffusion 2.x's 186 models (199x difference), with NSFW/SFW ratios of 0.49 and 0.20, respectively. Model counts were obtained by searching CivitAI with base model filters for each Stable Diffusion version. Stable Diffusion 2.x's substantial reduction in both NSFW content and models offers evidence that training data filtering can effectively reduce a model's utility for NSFW generation without eliminating broader capabilities.
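
The arithmetic behind the Figure 2 and Figure 3 captions can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes that a "market share" is a model's hit count divided by the total hits in that category, and that the NSFW/SFW ratio is one share divided by the other. The function names and the `model_a`/`model_b` counts are hypothetical; only the Stable Diffusion figures (37,075, 186, 0.49, 0.20) are taken from the Figure 3 caption.

```python
def market_shares(counts):
    """Convert raw hit counts per model into percentage shares summing to 100."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no hits to normalize")
    return {model: 100.0 * n / total for model, n in counts.items()}


def nsfw_sfw_ratio(sfw_counts, nsfw_counts):
    """NSFW share divided by SFW share per model (assumed definition).

    Values above 1 mean a model is over-represented in NSFW results
    relative to its share of SFW results.
    """
    sfw = market_shares(sfw_counts)
    nsfw = market_shares(nsfw_counts)
    return {m: nsfw[m] / sfw[m] for m in sfw if m in nsfw and sfw[m] > 0}


# Hypothetical hit counts, for illustration only (not data from the paper).
sfw_hits = {"model_a": 600, "model_b": 400}
nsfw_hits = {"model_a": 900, "model_b": 100}
ratios = nsfw_sfw_ratio(sfw_hits, nsfw_hits)

# Figures quoted in the Figure 3 caption.
sd1_nsfw_models, sd2_nsfw_models = 37_075, 186
count_multiple = sd1_nsfw_models / sd2_nsfw_models  # raw NSFW model counts
ratio_multiple = 0.49 / 0.20                        # NSFW/SFW ratios

print(ratios)
print(f"{count_multiple:.0f}x more NSFW models, {ratio_multiple:.2f}x higher NSFW/SFW ratio")
```

Under these assumptions, the raw count comparison reproduces the caption's roughly 199x difference, while the comparison of NSFW/SFW ratios gives a multiple of about 2.45. The two numbers answer different questions: the first reflects absolute volume, the second controls for how popular each base model is overall.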