A Knowledge-guided Adversarial Defense for Resisting Malicious Visual Manipulation

Dawei Zhou; Suzhi Gang; Decheng Liu; Tongliang Liu; Nannan Wang; Xinbo Gao

TL;DR

This work addresses the security risks of malicious visual manipulation and the shortcomings of data-only defenses by proposing a knowledge-guided adversarial defense (KGAD). KGAD jointly leverages domain-specific knowledge and visual-perception cues to generate adversarial noise that forces manipulation models to produce semantically confused outputs, improving protection across face- and style-manipulation tasks. The method optimizes a combined loss $L_{KGAD} = L_{pk} + \lambda L_{dk}$ with $L_{dk} = - \ell_d(\mathcal{K}_d(G_\theta(x)), \mathcal{K}_d(G_\theta(x+\delta)))$ and $L_{pk} = - \Delta_{pk}(G_\theta(x), G_\theta(x+\delta))$, where $G_\theta$ is the manipulation model, $x$ the clean input, $\delta$ the adversarial noise, $\mathcal{K}_d$ a domain-knowledge extractor (for example keypoints or content features), and $\Delta_{pk}$ a perceptual metric such as SSIMD or LPIPS. Experiments on CelebA and Monet2Photo with StarGAN, AGGAN, HiSD, CycleGAN, and AdaAttN show stronger distortion of manipulated outputs, better generalization, and better transferability than state-of-the-art defenses, supporting KGAD’s potential to mitigate real-world risks from deepfake and other malicious manipulations.
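The combined loss above can be illustrated with a minimal sketch. This is not the authors' implementation: the generator, the knowledge extractor, and both distance functions below are hypothetical toy stand-ins on plain lists of floats. In particular, mean squared error stands in for the perceptual metric (SSIMD/LPIPS) and for $\ell_d$, and the names `kgad_loss`, `toy_generator`, and `toy_knowledge` are assumptions introduced only for this example.

```python
from typing import Callable, List

Vector = List[float]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error; a toy stand-in for a perceptual or domain distance."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)


def kgad_loss(
    x: Vector,
    delta: Vector,
    generator: Callable[[Vector], Vector],
    domain_knowledge: Callable[[Vector], Vector],
    perceptual_distance: Callable[[Vector, Vector], float] = mse,
    domain_distance: Callable[[Vector, Vector], float] = mse,
    lam: float = 1.0,
) -> float:
    """Compute L_KGAD = L_pk + lam * L_dk for one input and one noise vector."""
    clean_out = generator(x)  # G(x)
    adv_out = generator([xi + di for xi, di in zip(x, delta)])  # G(x + delta)
    # L_pk: negative perceptual distance between clean and perturbed outputs.
    l_pk = -perceptual_distance(clean_out, adv_out)
    # L_dk: negative distance between the domain-knowledge features of both outputs.
    l_dk = -domain_distance(domain_knowledge(clean_out), domain_knowledge(adv_out))
    return l_pk + lam * l_dk


def toy_generator(v: Vector) -> Vector:
    """Hypothetical manipulation model: doubles every value."""
    return [2.0 * t for t in v]


def toy_knowledge(v: Vector) -> Vector:
    """Hypothetical knowledge extractor: the mean of the output."""
    return [sum(v) / len(v)]


if __name__ == "__main__":
    loss = kgad_loss([0.0, 1.0], [0.5, 0.5], toy_generator, toy_knowledge, lam=0.5)
    print(loss)
```

Because both terms are negated distances, minimizing this loss over $\delta$ pushes the perturbed output away from the clean output in both the perceptual and the domain-knowledge space. How $\delta$ is bounded and optimized is not specified in this summary, so the sketch leaves that step out.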

Abstract

Malicious applications of visual manipulation pose serious threats to the security and reputation of users in many fields. To alleviate these issues, adversarial noise-based defenses have been actively studied in recent years. However, "data-only" methods tend to distort fake samples in the low-level feature space rather than the high-level semantic space, which limits their ability to resist malicious manipulation. Frontier research has shown that integrating knowledge into deep learning can produce reliable and generalizable solutions. Inspired by this, we propose a knowledge-guided adversarial defense (KGAD) that actively forces malicious manipulation models to output semantically confusing samples. Specifically, when generating adversarial noise, we focus on constructing significant semantic confusion at the domain-specific knowledge level, and we replace general pixel-wise metrics with a metric closely related to visual perception. The generated adversarial noise actively interferes with the malicious manipulation model by triggering knowledge-guided and perception-related disruptions in the fake samples. To validate the effectiveness of the proposed method, we conduct qualitative and quantitative experiments on human perception and visual quality assessment. The results on two different tasks show that our defense provides better protection than state-of-the-art methods and generalizes well.
