Adversarial Attacks and Defenses in Images, Graphs and Text: A Review
Han Xu, Yao Ma, Haochen Liu, Debayan Deb, Hui Liu, Jiliang Tang, Anil K. Jain
TL;DR
This survey provides a structured, cross-domain synthesis of adversarial attacks and defenses for images, graphs, and text. It catalogs white-box, black-box, grey-box, and poisoning attacks, alongside gradient-masking, robust optimization, and detection-based defenses, including provable guarantees. It highlights practical realities such as transferability, physical-world adversaries, and domain-specific challenges in graphs and NLP. The work emphasizes the evolving cat-and-mouse dynamic and offers guidance for building more robust, verifiable systems.
Abstract
Deep neural networks (DNNs) have achieved unprecedented success in numerous machine learning tasks in various domains. However, the existence of adversarial examples has raised concerns about applying deep learning to safety-critical applications. As a result, we have witnessed increasing interest in studying attack and defense mechanisms for DNN models on different data types, such as images, graphs and text. Thus, it is necessary to provide a systematic and comprehensive overview of the main attack threats and the success of the corresponding countermeasures. In this survey, we review the state-of-the-art algorithms for generating adversarial examples and the countermeasures against adversarial examples, for three popular data types: images, graphs and text.
