How Deep Learning Sees the World: A Survey on Adversarial Attacks & Defenses
Joana C. Costa, Tiago Roxo, Hugo Proença, Pedro R. M. Inácio
TL;DR
This survey addresses the vulnerability of deep neural networks to adversarial perturbations in visual tasks, organizing attacks by attacker capacity and defenses into six domains, with a special focus on Vision Transformers. It compiles a comprehensive taxonomy of white-box, universal, black-box, and Auto-Attack–style attacks, and synthesizes defenses including adversarial training, training-process modifications, supplementary networks, architecture changes, validation, and purification. The work consolidates datasets and evaluation metrics, reporting state-of-the-art results on CIFAR-10/100 and ImageNet and outlining open issues such as robust evaluation across black-box and real-world scenarios. These insights offer practical guidance for deploying robust models and direct future research toward scalable, real-time defenses and standardized benchmarking.
Abstract
Deep Learning is currently used to perform multiple tasks, such as object recognition, face recognition, and natural language processing. However, Deep Neural Networks (DNNs) are vulnerable to perturbations that alter the network prediction (adversarial examples), raising concerns regarding their use in critical areas, such as self-driving vehicles, malware detection, and healthcare. This paper compiles the most recent adversarial attacks, grouped by attacker capacity, and modern defenses, clustered by protection strategy. We also present the new advances regarding Vision Transformers, summarize the datasets and metrics used in adversarial settings, and compare the state-of-the-art results under different attacks, finishing with the identification of open issues.
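To make the notion of an adversarial example concrete, the following is a minimal sketch of a one-step sign-gradient perturbation in the style of the well-known Fast Gradient Sign Method (FGSM), applied to a toy logistic-regression classifier rather than a DNN. The weights, input, and perturbation budget `epsilon` are hypothetical values chosen for illustration; they are not taken from this survey.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x, w, b):
    """Return the predicted class (0 or 1) of a logistic-regression model."""
    return int(sigmoid(np.dot(w, x) + b) >= 0.5)

def fgsm_perturb(x, y, w, b, epsilon):
    """One-step sign-gradient perturbation (FGSM-style).

    For binary cross-entropy loss, the gradient with respect to the
    input is (sigmoid(w.x + b) - y) * w. Moving each feature by
    epsilon in the direction of the gradient's sign increases the loss
    while keeping the perturbation within an L-infinity ball of radius
    epsilon.
    """
    p = sigmoid(np.dot(w, x) + b)
    grad_x = (p - y) * w
    return x + epsilon * np.sign(grad_x)

# Hypothetical toy model and input (assumptions for illustration only).
w = np.array([1.0, -2.0, 0.5])
b = 0.1
x = np.array([0.2, -0.1, 0.4])
y = 1  # true label

x_adv = fgsm_perturb(x, y, w, b, epsilon=0.3)

print("clean prediction:      ", predict(x, w, b))
print("adversarial prediction:", predict(x_adv, w, b))
print("max per-feature change:", np.max(np.abs(x_adv - x)))
```

In this toy setting, a change of at most 0.3 per feature is enough to flip the predicted class. Attacks on real image classifiers follow the same principle, with the gradient obtained by backpropagation through the network.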
