Panda or not Panda? Understanding Adversarial Attacks with Interactive Visualization

Yuzhe You; Jarvis Tse; Jian Zhao

TL;DR

AdvEx addresses the challenge of understanding adversarial attacks in image classification by offering a web-based, multi-level interactive visualization tailored for novices. Its backend generates adversarial examples using both the white-box Fast Gradient Sign Method (FGSM) and the black-box Zeroth Order Optimization (ZOO) attack, and computes embeddings via an Embedding Projector. The frontend presents Data Projectors, an Instance-level Attack Explainer, Robustness Analyzers, a Perturbation Adjuster, and integrated tutorials. The authors derive design goals from learner and teacher feedback, implement a plug-and-play framework supporting multiple models and attacks (e.g., CIFAR-10 with VGG/ResNet pairs and TRADES), and evaluate the system through a study with novice users and interviews with experts. Results indicate strong learning gains, high engagement, and perceived generalizability, and point to opportunities for extending the system to other datasets, attacks, and ML domains. Collectively, AdvEx bridges theory and practice in AML education and offers a scalable template for visualization-driven ML pedagogy and model robustness assessment.
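To make the FGSM step mentioned above concrete, the following is a minimal sketch, not AdvEx's implementation. It assumes a toy logistic-regression classifier in place of the image models, and the weights, input, and names (`predict_proba`, `fgsm`, `epsilon`) are hypothetical. FGSM moves every input feature by `epsilon` in the direction of the sign of the loss gradient with respect to the input.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, epsilon):
    """One-step FGSM: x_adv = clip(x + epsilon * sign(dLoss/dx), 0, 1)."""
    p = predict_proba(w, b, x)
    # Gradient of the binary cross-entropy loss with respect to the input x.
    grad = [(p - y) * wi for wi in w]
    x_adv = [xi + epsilon * sign(g) for xi, g in zip(x, grad)]
    # Keep the perturbed features in the valid pixel range [0, 1].
    return [min(1.0, max(0.0, v)) for v in x_adv]


# Hypothetical model and input, chosen only for illustration.
w = [2.0, -3.0, 1.5, -1.0]
b = 0.0
x = [0.6, 0.3, 0.5, 0.4]
y = 1  # true label

x_adv = fgsm(w, b, x, y, epsilon=0.1)
print("clean prediction:", predict_proba(w, b, x) > 0.5)
print("adversarial prediction:", predict_proba(w, b, x_adv) > 0.5)
```

In this toy setting, a perturbation of at most 0.1 per feature is enough to push the prediction across the decision boundary, which is the kind of effect a perturbation-magnitude control lets learners explore.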

Abstract

Adversarial machine learning (AML) studies attacks that can fool machine learning algorithms into generating incorrect outcomes as well as the defenses against worst-case attacks to strengthen model robustness. Specifically for image classification, it is challenging to understand adversarial attacks due to their use of subtle perturbations that are not human-interpretable, as well as the variability of attack impacts influenced by diverse methodologies, instance differences, and model architectures. Through a design study with AML learners and teachers, we introduce AdvEx, a multi-level interactive visualization system that comprehensively presents the properties and impacts of evasion attacks on different image classifiers for novice AML learners. We quantitatively and qualitatively assessed AdvEx in a two-part evaluation including user studies and expert interviews. Our results show that AdvEx is not only highly effective as a visualization tool for understanding AML mechanisms, but also provides an engaging and enjoyable learning experience, thus demonstrating its overall benefits for AML learners.

Paper Structure (29 sections, 2 equations, 6 figures, 1 table)

Introduction
Related Work
Adversarial Machine Learning
Visualizations of Adversarial Attacks
Visualizations for Learning ML
Design Goals
AdvEx
System Overview
Dataset and Models
Backend Pipeline
Attacker Module
Embedding Projector
Frontend User Interface
Data Projectors
Instance-level Attack Explainer
...and 14 more sections

Figures (6)

Figure 1: AdvEx user interface: (a) Robustness Analyzers that display the models' prediction accuracy pre- and post-attack; (b) Perturbation Adjuster that initiates the attack sequence with specified magnitude; (c) Data Projectors that visualize data embeddings in a 2-D latent space; (d) Instance-level Attack Explainer that displays in-depth information of the highlighted instance; (e) General Information Provider that provides more background on AdvEx and AML.
Figure 2: A schematic diagram depicting the system architecture of AdvEx. In the backend pipeline, an Attacker module performs users' choice of attacks on the image dataset, targeting models specified by users (G2). Once processed, the backend outputs are passed to the frontend interface for user interaction.
Figure 3: A user highlights and tracks a specific class from the dataset with selection mode. Under this mode, one can evaluate model performance on a dataset subset.
Figure 4: We explored a variety of visual encodings and aggregation features for the Data Projectors. We chose binned aggregation with multiple zoom levels, with an optional hexbin toggle to display the overall distribution (panel c). This preserves data scalability and displays global data structure without the need for high-performance devices.
Figure 5: An example of the final state of the step-by-step execution view for explaining the FGSM attack. The view progressively reveals attack elements and explanations, animated one by one to illustrate the flow of the attack process.
...and 1 more figure
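The binned aggregation with multiple zoom levels described in the Figure 4 caption can be illustrated with a short sketch. This is an assumption-laden example rather than the system's code: it uses square bins instead of hexbins, assumes embeddings normalized to the unit square, and the function name `bin_points` and its parameters are hypothetical. Each zoom level doubles the grid resolution, so the number of drawn marks is bounded by the grid size rather than by the number of data points.

```python
from collections import Counter


def bin_points(points, zoom, base_bins=4, extent=(0.0, 1.0)):
    """Count 2-D points per square grid cell; each zoom level doubles the resolution."""
    lo, hi = extent
    n = base_bins * (2 ** zoom)      # cells per axis at this zoom level
    cell = (hi - lo) / n             # side length of one cell
    counts = Counter()
    for x, y in points:
        # Clamp indices so points on the upper boundary fall in the last cell.
        col = min(n - 1, max(0, int((x - lo) / cell)))
        row = min(n - 1, max(0, int((y - lo) / cell)))
        counts[(col, row)] += 1
    return counts


# Hypothetical 2-D embedding coordinates.
points = [(0.05, 0.05), (0.10, 0.20), (0.20, 0.10), (0.90, 0.90), (0.95, 0.85)]

coarse = bin_points(points, zoom=0)  # 4 x 4 grid: two occupied cells
fine = bin_points(points, zoom=1)    # 8 x 8 grid: every point in its own cell
print(dict(coarse))
print(dict(fine))
```

At the coarse level the five points collapse into two cells that convey the global structure, while zooming in separates them into individual cells.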
