A Training Rate and Survival Heuristic for Inference and Robustness Evaluation (TRASHFIRE)

Charles Meyers; Mohammad Reza Saleh Sedghpour; Tommy Löfstedt; Erik Elmroth

TL;DR

The present work addresses the problem of understanding and predicting how particular model hyper-parameters influence the performance of a model in the presence of an adversary. Using the proposed methodology, it is shown that ResNet is hopelessly insecure against even the simplest of white box attacks.

Abstract

Machine learning models, deep neural networks in particular, have performed remarkably well on benchmark datasets across a wide variety of domains. However, the ease of finding adversarial counter-examples remains a persistent problem when training times are measured in hours or days and the time needed to find a successful adversarial counter-example is measured in seconds. Much work has gone into generating and defending against these adversarial counter-examples; however, the relative costs of attacks and defences are rarely discussed. Additionally, machine learning research is almost entirely guided by test/train metrics, but these would require billions of samples to meet industry standards. The present work addresses the problem of understanding and predicting how particular model hyper-parameters influence the performance of a model in the presence of an adversary. The proposed approach uses survival models, worst-case examples, and a cost-aware analysis to precisely and accurately reject a particular model change during routine model training procedures rather than relying on real-world deployment, expensive formal verification methods, or accurate simulations of very complicated systems (e.g., digitally recreating every part of a car or a plane). Through an evaluation of many pre-processing techniques, adversarial counter-examples, and neural network configurations, the conclusion is that deeper models do offer marginal gains in survival times compared to shallower counterparts. However, we show that those gains are driven more by the model inference time than by inherent robustness properties. Using the proposed methodology, we show that ResNet is hopelessly insecure against even the simplest of white box attacks.
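To make the idea of a survival model concrete, the following is a minimal sketch of the simplest parametric case, a constant-hazard (exponential) model fitted with right-censored observations. It is an illustration only and not the authors' implementation: the function names, the data, and the reading of "failure" as "the attack found a successful adversarial counter-example within the time budget" are all assumptions made for this example.

```python
import math


def fit_exponential_rate(durations, failed):
    """Maximum-likelihood hazard rate for an exponential survival model.

    durations: time (seconds) each sample was under attack (hypothetical data).
    failed:    True if the attack succeeded, False if the sample was
               right-censored (still correctly classified when we stopped).
    With right censoring, the MLE is (number of failures) / (total time observed).
    """
    if len(durations) != len(failed):
        raise ValueError("durations and failed must have the same length")
    events = sum(1 for f in failed if f)
    exposure = sum(durations)
    if events == 0 or exposure <= 0:
        raise ValueError("need at least one failure and positive total time")
    return events / exposure


def survival_probability(rate, t):
    """S(t) = exp(-rate * t): probability of surviving beyond time t."""
    return math.exp(-rate * t)


# Hypothetical observations: three samples failed, two were censored at 2.0 s.
durations = [0.5, 1.0, 1.5, 2.0, 2.0]
failed = [True, True, True, False, False]

rate = fit_exponential_rate(durations, failed)
mean_survival_time = 1.0 / rate  # expected time to failure under this model
```

The paper's outline also lists Weibull, log-normal, log-logistic, and generalised gamma models; the exponential case is shown here only because its estimator can be written in closed form.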

Paper Structure (31 sections, 29 equations, 4 figures, 1 table)

Introduction
Motivations
Contributions
Background
Adversarial Attacks
Accuracy and Failure Rate
Cost
Survival Analysis for ML
The Cox Proportional Hazard Model
Accelerated Failure Time Models
Exponential
Weibull
Log-Normal
Log-Logistic
Generalised Gamma
...and 16 more sections

Figures (4)

Figure 1: The adversarial accuracy across various attacks pictured on the first axis and outlined in the section on attacks. The error bars reflect 95% confidence intervals for the adversarial accuracy across all examined samples. The violin plots reflect 95% confidence intervals for each tuned hyperparameter combination. Outliers are indicated with a circle.
Figure 2: These quantile-quantile plots demonstrate the efficacy of various AFT models. The first axis is the observed quantile of a sample and the second axis represents the theoretical quantile according to the chosen AFT model. The dashed black line represents a perfect fit. To verify each model, we reserved 80% of the data to be the training set (blue) and 20% to be the test set (red). The time for each model was chosen to depict the best fit of the curve when the time to failure, $t$, satisfies $0 \leq t \leq 10$ seconds.
Figure 3: The coefficients represent the log scale effect of the dummy variables for dataset (Data), attack (Atk), and defence (Def) on the survival time, with a positive value indicating an increase in the survival time. The right plot depicts the covariates and the left plot depicts the dummy variables for the different attacks, defences, and datasets.
Figure 4: This figure depicts the TRASH metric that reflects the ratio of training-to-attack times, where a value $\gg 1$ indicates an essential advantage for the attacker. The violin plots reflect the 95% confidence intervals for each tuned hyperparameter combination. Outliers are indicated with a circle.
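
The caption of Figure 4 describes the TRASH metric only as a ratio of training-to-attack times; the exact definition is given in the body of the paper and is not reproduced here. As a rough sketch under that assumption, a plain ratio could be computed as follows. The function name and the timing values are hypothetical.

```python
def training_to_attack_ratio(train_seconds, attack_seconds):
    """Plain ratio of training time to attack time (an assumed simplification).

    A value much greater than 1 means the defender spends far more time
    training than the attacker spends finding a counter-example.
    """
    if train_seconds < 0:
        raise ValueError("train_seconds must be non-negative")
    if attack_seconds <= 0:
        raise ValueError("attack_seconds must be positive")
    return train_seconds / attack_seconds


# Hypothetical timings: one hour of training versus a two-second attack.
ratio = training_to_attack_ratio(train_seconds=3600.0, attack_seconds=2.0)
attacker_has_advantage = ratio > 1.0
```

With these made-up numbers the ratio is far above 1, which is the regime the caption describes as favouring the attacker.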
