Adversarial Robustness Toolbox v1.0.0

Maria-Irina Nicolae, Mathieu Sinn, Minh Ngoc Tran, Beat Buesser, Ambrish Rawat, Martin Wistuba, Valentina Zantedeschi, Nathalie Baracaldo, Bryant Chen, Heiko Ludwig, Ian M. Molloy, Ben Edwards

TL;DR

ART provides a modular, framework-agnostic toolkit for adversarial machine learning, integrating a broad spectrum of attacks, defences, detectors, and metrics to benchmark and harden models. It supports multiple ML libraries, enables end-to-end testing from data generation to robustness evaluation, and includes mechanisms for adversarial training, preprocessing, and runtime detection. The library also addresses data poisoning detection and offers tools for certifiable robustness via randomized smoothing and robustness verification for tree ensembles. By delivering standardized interfaces, ART facilitates reproducible experimentation and secure deployment of AI systems in adversarial settings.

Abstract

Adversarial Robustness Toolbox (ART) is a Python library that supports developers and researchers in defending Machine Learning models (Deep Neural Networks, Gradient Boosted Decision Trees, Support Vector Machines, Random Forests, Logistic Regression, Gaussian Processes, Decision Trees, Scikit-learn Pipelines, etc.) against adversarial threats and helps make AI systems more secure and trustworthy. Machine Learning models are vulnerable to adversarial examples, which are inputs (images, texts, tabular data, etc.) deliberately modified to produce a desired response from the model. ART provides the tools to build and deploy defences and to test them with adversarial attacks. Defending Machine Learning models involves certifying and verifying model robustness, and hardening models with approaches such as pre-processing inputs, augmenting training data with adversarial samples, and leveraging runtime detection methods to flag any inputs that might have been modified by an adversary. The attacks implemented in ART make it possible to mount adversarial attacks against Machine Learning models, which is required to test defences under state-of-the-art threat models. Supported Machine Learning libraries include TensorFlow (v1 and v2), Keras, PyTorch, MXNet, Scikit-learn, XGBoost, LightGBM, CatBoost, and GPy. The source code of ART is released under the MIT license at https://github.com/IBM/adversarial-robustness-toolbox. The release includes code examples, notebooks with tutorials, and documentation (http://adversarial-robustness-toolbox.readthedocs.io).
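To make the notion of an adversarial example concrete, the following is a minimal sketch of a one-step sign-gradient attack (the fast gradient sign method) against a tiny logistic-regression model. It deliberately does not use ART's API: the weights, the input, the perturbation budget `eps`, and the helper names `predict_proba` and `fgsm` are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model (hypothetical helper)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fgsm(w, b, x, y, eps):
    """Shift every feature by +/- eps in the direction that increases the loss for label y."""
    p = predict_proba(w, b, x)
    # For the cross-entropy loss, the gradient with respect to the input is (p - y) * w.
    grad = [(p - y) * wi for wi in w]
    # Keep only the sign of each gradient component (+1, 0, or -1).
    signs = [(g > 0) - (g < 0) for g in grad]
    return [xi + eps * s for xi, s in zip(x, signs)]


# Illustrative model and input; none of these values come from the paper.
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1
eps = 0.5

x_adv = fgsm(w, b, x, y, eps)

print("clean prediction:      ", predict_proba(w, b, x) > 0.5)
print("adversarial prediction:", predict_proba(w, b, x_adv) > 0.5)
```

In this toy setting the clean input is assigned to class 1, while the perturbed input, which differs from it by at most `eps` in every feature, falls on the other side of the decision boundary. A defence would be tested by checking whether it prevents or detects this change.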

Paper Structure

This paper contains 63 sections, 21 equations, 2 figures, 1 table, 9 algorithms.

Figures (2)

  • Figure 1: Activations of the last hidden layer projected onto the first 3 principal components. (a) Activations of the poisoned class. (d) Activations of the unpoisoned class.
  • Figure 2: Library visualization of the clusters for class five produced by the Activation Clustering defence (see code example below). The sprite on the left contains samples from the first cluster, while the sprite on the right contains samples from the second cluster. The second cluster visibly contains the poisoned data.
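The clustering idea behind both figures can be sketched independently of the library. The snippet below is not the library code example that the Figure 2 caption refers to, and it does not use ART's API. It is a minimal illustration that assumes the activations for a single class are already available as low-dimensional points, splits them into two groups with a plain 2-means loop, and flags the smaller group as suspicious. The data, the deterministic initialisation, the smaller-cluster heuristic, and the helper names are all assumptions made for illustration.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def centroid(points):
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def two_means(points, iters=20):
    """Plain 2-means clustering with a deterministic initialisation."""
    # Start from the first point and the point farthest from it.
    c0 = points[0]
    c1 = max(points, key=lambda p: dist2(p, c0))
    centers = [list(c0), list(c1)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centre.
        labels = [0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1
                  for p in points]
        # Update step: each centre moves to the mean of its members.
        for k in (0, 1):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:
                centers[k] = centroid(members)
    return labels


# Hypothetical 2-D "activations" for one class: six clean samples, two poisoned ones.
activations = [
    [0.1, 0.2], [0.0, 0.1], [0.2, 0.0], [0.1, 0.1], [0.15, 0.05], [0.05, 0.15],
    [3.0, 3.1], [3.2, 2.9],
]

labels = two_means(activations)
sizes = [labels.count(0), labels.count(1)]
suspect = sizes.index(min(sizes))  # assumption: the smaller cluster is the suspicious one
flagged = [i for i, l in enumerate(labels) if l == suspect]

print("cluster sizes:", sizes)
print("flagged sample indices:", flagged)
```

In practice the points would come from the last hidden layer of a trained network, reduced in dimension first (Figure 1 projects them onto principal components), and the decision about which cluster is poisoned would rest on more than relative size alone.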