TA3: Testing Against Adversarial Attacks on Machine Learning Models

Yuanzhe Jin; Min Chen

TL;DR

TA3 is an interactive system for testing ML models against adversarial attacks. Its human-in-the-loop (HITL) design enables human-steered attack simulation and visualization-assisted evaluation of attack impact. It demonstrates the importance of HITL in ML testing and its potential application to testing workflows for other types of ML models and other types of adversarial attacks.

Abstract

Adversarial attacks are major threats to the deployment of machine learning (ML) models in many applications. Testing ML models against such attacks is becoming an essential step for evaluating and improving ML models. In this paper, we report the design and development of an interactive system for aiding the workflow of Testing Against Adversarial Attacks (TA3). In particular, with TA3, human-in-the-loop (HITL) enables human-steered attack simulation and visualization-assisted attack impact evaluation. While the current version of TA3 focuses on testing decision tree models against adversarial attacks based on the One Pixel Attack Method, it demonstrates the importance of HITL in ML testing and the potential application of HITL to the ML testing workflows for other types of ML models and other types of adversarial attacks.
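To make the idea of a one-pixel attack on a decision tree concrete, the following is a minimal sketch and not TA3's implementation. It makes several assumptions: `toy_tree_predict` is a hypothetical hand-written tree standing in for a trained model, the image is a 4x4 grid of grayscale values, and the search is plain random sampling rather than the differential evolution used by the original One Pixel Attack. The `cumulative_successes` helper is likewise hypothetical and only illustrates the kind of per-iteration summary a tester might want to plot.

```python
import random
from itertools import accumulate


def toy_tree_predict(img):
    """Hypothetical two-level decision tree over a 4x4 grayscale image."""
    if img[1][1] > 128:
        return "bright" if img[2][2] > 64 else "mixed"
    return "dark"


def one_pixel_attack(img, predict, iterations=200, seed=0):
    """Random-search one-pixel attack (an assumption; not differential evolution).

    Each iteration changes exactly one pixel of a copy of `img` and records
    whether the model's prediction differs from the unperturbed prediction.
    """
    rng = random.Random(seed)
    original = predict(img)
    height, width = len(img), len(img[0])
    successes = []  # (row, col, new_value, new_label) for each flipped prediction
    history = []    # 1 if the trial flipped the prediction, else 0
    for _ in range(iterations):
        r, c = rng.randrange(height), rng.randrange(width)
        value = rng.randrange(256)
        perturbed = [row[:] for row in img]  # copy so the input stays intact
        perturbed[r][c] = value
        label = predict(perturbed)
        flipped = label != original
        history.append(int(flipped))
        if flipped:
            successes.append((r, c, value, label))
    return original, successes, history


def cumulative_successes(history):
    """Running count of successful trials, one entry per iteration."""
    return list(accumulate(history))


if __name__ == "__main__":
    image = [[200] * 4 for _ in range(4)]
    original, successes, history = one_pixel_attack(image, toy_tree_predict)
    print("original label:", original)
    print("successful perturbations:", len(successes))
    print("cumulative successes (last 5):", cumulative_successes(history)[-5:])
```

In this toy setting only the pixels that the tree actually tests can flip the prediction, which is the kind of model behavior a tester would want to observe rather than only a summary success rate.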

Paper Structure (14 sections, 8 equations, 10 figures, 1 table)

Introduction
Related Work
Human-centered Artificial Intelligence
Adversarial Attack Methods
Decision Tree
A Conceptual Workflow
TA3: System Overview
TA3: Algorithm, Interaction, Statistics, and Visualization
Algorithms: Machine Learning and Attack Simulation
Interaction: Testing Steering
Statistics: Simulation Summarization
Visualization: Simulation Progression and Testing Results
Case Studies
Discussions and Conclusion

Figures (10)

Figure 1: (a) Conventional workflows for testing adversarial attack algorithms focus on the statistics and instances that confirm the successes of an attack algorithm. Although iterative testing is common, such processes are typically not mentioned in the literature and are not supported by a purposely designed user interface. (b) An ideal workflow for testing against adversarial attacks should focus on the analysis of the behaviors of the model under attack and may feature many iterative loops where ML developers need to observe input data, generate attack data, carry out testing runs, and observe results. A software tool designed to support HITL activities can provide such iterative loops with effective and efficient user controls of the testing processes and visualization of the data, models, and results in the processes.
Figure 2: The main user interface of TA3, which is an HITL-supporting tool for testing ML models against adversarial attacks. The screen is roughly divided into four areas. (a) Below the menu bar, the Data View area supports HITL activities related to the data used for testing ML models. (b) The Attack Generation View area focuses on the HITL activities for controlling the adversarial attack algorithm and observing its input and output data. (c) The Model View area enables the visualization of the testing data flowing through the model structure. (d) The Results View area provides HITL-supporting facilities for observing and interacting with different visualization plots.
Figure 3: Left: A pop-up window showing all perturbed data objects generated by the adversarial attack algorithm. Right: A decision tree with true- and false-positive data flows.
Figure 4: Two ways to visualize the movement of the simulated attack points during a test. (a) Intuitive view in the context of the image (Image No. 75) being attacked. (b & c) Detailed progression of the attacking points during the simulation, with color-mapped pixels representing the amount of increasing and decreasing pixel brightness and a yellow background indicating a successful attack.
Figure 5: Three visualization plots comparing (a) four models in terms of the number of cumulative successes vs. the number of iterations, (b) 10 classes in terms of success rate vs. the number of iterations, and (c) 100 images in terms of success rate vs. the number of iterations.
...and 5 more figures
