A Survey of Adversarial Defenses in Vision-based Systems: Categorization, Methods and Challenges

Nandish Chattopadhyay, Abdul Basit, Bassem Ouni, Muhammad Shafique

TL;DR

This survey analyzes adversarial defenses in vision-based systems with a focus on image classification and object detection. It categorizes defenses into model modification, re-training, pre-processing, patch-oriented approaches, and other methods, and maps each to corresponding attack types and datasets. The work highlights certifiable defenses, practical trade-offs, and the need for robust, interpretable, and transferable strategies. It also outlines promising directions like ensemble approaches, dynamic defenses, and domain-aware defenses to advance trustworthy AI in real-world deployments.

Abstract

Adversarial attacks have emerged as a major challenge to the trustworthy deployment of machine learning models, particularly in computer vision applications. These attacks vary in potency and can be mounted in both white-box and black-box settings. Practical attacks include methods that manipulate the physical world to induce adversarial behaviour in the target neural network models. The literature offers many approaches to mitigating these attacks, each with its own advantages and limitations. In this survey, we present a comprehensive systematization of knowledge on adversarial defenses, focusing on two key computer vision tasks: image classification and object detection. We review the state-of-the-art adversarial defense techniques and categorize them for easier comparison. In addition, we provide a schematic representation of these categories within the context of the overall machine learning pipeline, facilitating clearer understanding and benchmarking of defenses. Furthermore, we map these defenses to the types of adversarial attacks and datasets where they are most effective, offering practical insights for researchers and practitioners. Understanding how far the available defenses address adversarial threats, and where they fall short, is necessary for steering research in this area in the most appropriate direction, with the aim of building trustworthy AI systems for everyday practical use cases.

Paper Structure

This paper contains 50 sections, 1 equation, 15 figures, 6 tables.

Figures (15)

  • Figure 1: Overview of the paper organization. This figure illustrates the structure and flow of the sections, highlighting the key topics discussed throughout the paper.
  • Figure 2: An example of an adversarial sample exhibiting adversarial behavior.
  • Figure 3: Organization of different approaches for defenses against attacks on image classification tasks in vision-based systems.
  • Figure 4: Schematic representation of the integration of the defense techniques in different parts of a standard machine learning pipeline for image classification tasks.
  • Figure 5: Defensive distillation methodology overview: Knowledge Distillation (KD) trains a student model to mimic a teacher model, reducing sensitivity to input perturbations. While effective against white-box attacks, KD remains vulnerable to black-box attacks. The proposed approach improves robustness by training the student to learn a different latent space instead of mimicking the teacher's outputs.
  • ...and 10 more figures