A Cost-Aware Approach to Adversarial Robustness in Neural Networks
Charles Meyers, Mohammad Reza Saleh Sedghpour, Tommy Löfstedt, Erik Elmroth
TL;DR
This work addresses the challenge of evaluating adversarial robustness for neural networks in production by introducing a cloud-native, cost-aware framework based on survival analysis. It uses accelerated failure time (AFT) models to predict time-to-failure under adversarial perturbations while jointly optimizing benign accuracy, adversarial accuracy, and training-time cost with a Tree Parzen Estimator (TPE). The methodology enables comparisons across hardware (e.g., P100, V100, L4) and operational settings by tying performance to measurable times (training, inference, attack generation) and costs, summarized in metrics such as the TRASH score. Empirical results show that newer hardware reduces training time but yields diminishing accuracy gains, and that 8-bit inference-focused hardware (L4) can offer favorable cost-robustness trade-offs, supporting the practicality of the proposed approach for risk-aware deployment and rapid iteration under safety constraints.
Abstract
Given the growing prominence of production-level AI and the threat of adversarial attacks that can evade a model at run-time, evaluating the robustness of models to these evasion attacks is of critical importance. Additionally, testing model changes likely means deploying the models to a physical system (e.g., a car, a medical imaging device, or a drone) to see how the changes affect performance. This makes untested changes a public problem that reduces development speed, increases the cost of development, and makes it difficult (if not impossible) to separate cause from effect. In this work, we used survival analysis as a cloud-native, time-efficient, and precise method for predicting model performance in the presence of adversarial noise. For neural networks in particular, the relationships between the learning rate, batch size, training time, convergence time, and deployment cost are highly complex, so researchers generally rely on benchmark datasets to assess the ability of a model to generalize beyond the training data. To address this, we propose using accelerated failure time models to measure the effect of hardware choice, batch size, number of epochs, and test-set accuracy, using adversarial attacks to induce failures on a reference model architecture before the model is deployed to the real world. We evaluate several GPU types and use the Tree Parzen Estimator to maximize model robustness and minimize model run-time simultaneously. This provides a way to evaluate and optimize the model in a single step, while also allowing us to model the effect of model parameters on training time, prediction time, and accuracy. Using this technique, we demonstrate that newer, more powerful hardware does decrease the training time, but with a monetary and power cost that far outpaces the marginal gains in accuracy.
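To make the accelerated failure time idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a log-normal AFT model with no censoring, in which case fitting reduces to ordinary least squares on the logarithm of the failure time. The covariates (log batch size, number of epochs), the coefficient values, and all function names are hypothetical, and the data are synthetic.

```python
import numpy as np


def simulate_attack_times(n, rng):
    """Draw synthetic time-to-failure data from a log-normal AFT model.

    All covariates and coefficients here are illustrative assumptions,
    not values reported in the paper.
    """
    log_batch = rng.uniform(np.log(16), np.log(1024), size=n)
    epochs = rng.integers(1, 51, size=n).astype(float)
    # Design matrix: intercept, log batch size, number of epochs.
    X = np.column_stack([np.ones(n), log_batch, epochs])
    true_beta = np.array([0.5, -0.2, 0.03])  # assumed, for illustration only
    sigma = 0.3  # assumed noise scale on the log-time axis
    # AFT form: log T = X @ beta + sigma * eps, with eps ~ N(0, 1).
    log_t = X @ true_beta + sigma * rng.standard_normal(n)
    return X, np.exp(log_t), true_beta


def fit_lognormal_aft(X, t):
    """Fit the log-normal AFT model by least squares on log(t).

    This is only valid when every failure time is observed (no censoring).
    """
    beta, *_ = np.linalg.lstsq(X, np.log(t), rcond=None)
    return beta


def acceleration_factors(beta):
    """Multiplicative effect on failure time per unit increase in a covariate."""
    return np.exp(beta[1:])  # skip the intercept


rng = np.random.default_rng(0)
X, t, true_beta = simulate_attack_times(5000, rng)
beta_hat = fit_lognormal_aft(X, t)
af = acceleration_factors(beta_hat)
print("estimated coefficients:", beta_hat)
print("acceleration factors (log batch size, epochs):", af)
```

An acceleration factor above 1 means the covariate stretches the expected time until an attack succeeds, and a factor below 1 means it shortens that time. With censored observations (for example, attacks that never succeed within the budget), a likelihood-based fit from a survival analysis library would be needed instead of least squares.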
