Foolbox: A Python toolbox to benchmark the robustness of machine learning models
Jonas Rauber, Wieland Brendel, Matthias Bethge
TL;DR
Foolbox addresses the challenge of evaluating model robustness to adversarial perturbations by providing a Python toolbox that standardizes the generation of adversarial examples and the benchmarking process. It offers a cross-framework interface, a modular structure, and a wide range of attacks with internal hyperparameter tuning to approximate the global minimum perturbation. The paper details the five core components (models, criteria, distances, attacks, adversarial) and a comprehensive suite of gradient-based, score-based, and decision-based attacks, enabling robust cross-model comparisons. The framework emphasizes reproducibility through explicit versioning, reporting guidelines, and a broad spectrum of criteria and distance metrics, aiming to unify and accelerate robustness research in practical settings.
Abstract
Even today's most advanced machine learning models are easily fooled by almost imperceptible perturbations of their inputs. Foolbox is a new Python package to generate such adversarial perturbations and to quantify and compare the robustness of machine learning models. It is built around the idea that the most comparable robustness measure is the minimum perturbation needed to craft an adversarial example. To this end, Foolbox provides reference implementations of most published adversarial attack methods alongside some new ones, all of which perform internal hyperparameter tuning to find the minimum adversarial perturbation. Additionally, Foolbox interfaces with most popular deep learning frameworks such as PyTorch, Keras, TensorFlow, Theano and MXNet and allows different adversarial criteria such as targeted misclassification and top-k misclassification as well as different distance measures. The code is licensed under the MIT license and is openly available at https://github.com/bethgelab/foolbox . The most up-to-date documentation can be found at http://foolbox.readthedocs.io .
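The idea of measuring robustness by the minimum adversarial perturbation, found by tuning an attack's step size internally, can be illustrated with a small sketch. It does not use the Foolbox API: the toy linear classifier, the function names (`predict`, `gradient_sign_step`, `minimum_epsilon`), and the choice of a binary search over the step size are all assumptions made for illustration, and Foolbox's actual attacks may tune their hyperparameters differently.

```python
# Illustrative sketch only: a toy linear classifier and a gradient-sign
# perturbation whose size (epsilon) is tuned by binary search. None of
# these names come from Foolbox; they are hypothetical.

def predict(w, b, x):
    """Return class 1 if the linear score w.x + b is positive, else 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else 0

def sign(v):
    return (v > 0) - (v < 0)

def gradient_sign_step(w, x, label, epsilon):
    """Move x by epsilon per coordinate in the direction that pushes the
    score toward the other class. For a linear model, the gradient of the
    score with respect to x is simply w."""
    direction = -1 if label == 1 else 1
    return [xi + direction * epsilon * sign(wi) for xi, wi in zip(x, w)]

def minimum_epsilon(w, b, x, max_epsilon=1.0, steps=30):
    """Binary-search the smallest epsilon (the L-infinity size of the
    perturbation) that changes the predicted class. Returns None if even
    max_epsilon does not change the prediction."""
    label = predict(w, b, x)
    if predict(w, b, gradient_sign_step(w, x, label, max_epsilon)) == label:
        return None
    lo, hi = 0.0, max_epsilon  # hi always fools the model, lo never does
    for _ in range(steps):
        mid = (lo + hi) / 2
        if predict(w, b, gradient_sign_step(w, x, label, mid)) != label:
            hi = mid
        else:
            lo = mid
    return hi

w, b, x = [2.0, -1.0], 0.5, [1.0, 0.5]
eps = minimum_epsilon(w, b, x)
```

For this toy model the original score is 2.0 and each unit of epsilon lowers it by 3 (the sum of the absolute weights), so the search converges toward 2/3. Reporting the smallest perturbation that succeeds, rather than whether an attack succeeds at a fixed size, is what makes the number comparable across models.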
