Towards Robust Protective Perturbation against DeepFake Face Swapping

Hengyang Yao, Lin Li, Ke Sun, Jianing Qiu, Huiping Chen

TL;DR

The paper addresses the fragility of proactive defenses against DeepFake face swapping under image transformations. It introduces EOLT (Expectation Over Learned distribution of Transformation), a reinforcement-learning-based framework that learns a transformation policy to guide protective perturbation generation, replacing the uniform sampling used in Expectation over Transformation (EOT). In experiments on FFHQ with 30 transformations, EOLT achieves ~26% higher average robustness and generalizes well to unseen transformations, with notable gains in the Stylization category. By explicitly modeling and optimizing transformation bottlenecks, the work moves robust protection closer to practical deployment.

Abstract

DeepFake face swapping enables highly realistic identity forgeries, posing serious privacy and security risks. A common defence embeds invisible perturbations into images, but these are fragile and often destroyed by basic transformations such as compression or resizing. In this paper, we first conduct a systematic analysis of 30 transformations across six categories and show that protection robustness is highly sensitive to the choice of training transformations, making the standard Expectation over Transformation (EOT) with uniform sampling fundamentally suboptimal. Motivated by this, we propose Expectation Over Learned distribution of Transformation (EOLT), a framework that treats the transformation distribution as a learnable component rather than a fixed design choice. Specifically, EOLT employs a policy network that learns, via reinforcement learning, to automatically prioritize critical transformations and adaptively generate instance-specific perturbations, enabling explicit modeling of defensive bottlenecks while maintaining broad transferability. Extensive experiments demonstrate that our method achieves substantial improvements over state-of-the-art approaches, with 26% higher average robustness and up to 30% gains on challenging transformation categories.
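
The contrast between EOT and EOLT drawn in the abstract can be written schematically as below. The notation is assumed here for illustration and is not taken from the paper: x is the image, δ the protective perturbation, ε an assumed L-infinity budget, 𝒯 the set of training transformations, 𝓛 a protection loss measured on the face-swapping output, and π_θ the policy network.

```latex
% Schematic only: symbols and the norm constraint are assumptions, not the paper's equations.
\text{EOT:}\quad
\max_{\|\delta\|_\infty \le \epsilon}\;
\mathbb{E}_{t \sim \mathcal{U}(\mathcal{T})}
\big[\mathcal{L}\big(t(x+\delta)\big)\big]
\qquad
\text{EOLT:}\quad
\max_{\|\delta\|_\infty \le \epsilon}\;
\mathbb{E}_{t \sim \pi_\theta(\cdot \mid x)}
\big[\mathcal{L}\big(t(x+\delta)\big)\big]
```

The only difference in this sketch is the sampling distribution over transformations: uniform in EOT, and image-conditioned and learned in EOLT.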

Paper Structure

This paper contains 26 sections, 11 equations, 9 figures, 5 tables.

Figures (9)

  • Figure 1: Illustration of the research problem and our method. The protective perturbation generated by the naive PGD algorithm (Madry et al., 2018) loses its effectiveness after the perturbed image undergoes a box blur transformation, whereas our method maintains its protective effect and continues to disrupt the face-swapping output. Note that naive PGD achieves a protective effect similar to ours when no transformation is applied.
  • Figure 2: Cross-transformation identity similarity matrix. Rows indicate transformations used during perturbation generation, while columns indicate transformations applied at test time. "Clean" denotes no transformation applied.
  • Figure 3: Illustration of sampling multiple sub-policies from the policy model for a given image. Taking an input image, the policy model produces a logits vector. These logits are converted into a sub-policy probability distribution, from which multiple sub-policy indices are sampled. Each sampled index is then decoded into a transformation-intensity combination, forming the sub-policies used during EOLT optimization.
  • Figure 4: Learned probability distribution over the sub-policies available for perturbation generation, averaged over all images in the training split of the FFHQ dataset. There are 81 sub-policies in total. Each color corresponds to one of the 9 training transformations in this setup, and each transformation is associated with 9 discrete intensity levels, indexed from 0 to 8 in the figure.
  • Figure 5: Performance of PGD-EOT and our method, PGD-EOLT, as a function of the number of PGD steps on the training split of the FFHQ dataset. The figure highlights distinct optimization dynamics, with PGD-EOT stabilizing quickly while PGD-EOLT continues improving over extended iterations.
  • ...and 4 more figures
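
The sampling procedure described in the captions of Figures 3 and 4 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the random logits stand in for the policy model's output, the function names are hypothetical, and the index layout (sub-policy index = transformation id × 9 + intensity level) is an assumption chosen to match the 9 × 9 = 81 sub-policies reported in Figure 4.

```python
import math
import random

NUM_TRANSFORMS = 9  # training transformations (Figure 4)
NUM_LEVELS = 9      # discrete intensity levels, indexed 0..8 (Figure 4)


def softmax(logits):
    """Convert a logits vector into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(index):
    """Map a sub-policy index to (transformation id, intensity level).

    Assumed layout: index = transformation_id * NUM_LEVELS + level.
    """
    return divmod(index, NUM_LEVELS)


def sample_sub_policies(logits, k, rng):
    """Sample k sub-policies (with replacement) from the policy distribution."""
    probs = softmax(logits)
    indices = rng.choices(range(len(probs)), weights=probs, k=k)
    return [decode(i) for i in indices]


rng = random.Random(0)
# Stand-in for the policy model's output on one image (hypothetical values).
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_TRANSFORMS * NUM_LEVELS)]
sub_policies = sample_sub_policies(logits, k=4, rng=rng)
print(sub_policies)  # list of (transformation id, intensity level) pairs
```

Setting all logits equal makes the distribution uniform over the 81 sub-policies, which corresponds to the uniform sampling of standard EOT; a learned policy instead concentrates probability on the combinations it finds most critical.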