Topological safeguard for evasion attack interpreting the neural networks' behavior

Xabier Echeberria-Barrio; Amaia Gil-Lerchundi; Iñigo Mendialdua; Raul Orduna-Urrutia

TL;DR

This work tackles evasion attacks on deep neural networks by exploiting the classifier's topology. It introduces a behavior-graph representation of the classifier, derives four neuron attributes from this topology, and uses a Graph Convolutional Network-based detector trained on a rich, graph-informed preprocessing pipeline. Empirical results on a breast cancer dataset show competitive detection performance, with the Influence attribute often providing strong signal and BIM attacks being most detectable. The approach demonstrates that topology-aware detectors can outperform existing activation-based detectors and opens avenues for interpretable, topology-driven defenses in adversarial settings.

Abstract

In recent years, Deep Learning technology has been adopted in many fields, bringing advances to each of them but also introducing new cybersecurity threats. Deployed models carry several vulnerabilities associated with Deep Learning technology, which allow an attacker to exploit the model, obtain private information, and even modify the model's decision-making. Interest in studying these vulnerabilities and attacks, and in designing defenses to prevent or counter them, is therefore growing among researchers. In particular, the widely known evasion attack has been studied extensively, and several defenses against it can be found in the literature. This threat has concerned the research community since the presentation of the L-BFGS algorithm, and new and ingenious countermeasures continue to be developed because no defense is perfect against all known evasion algorithms. In this work, a novel detector of evasion attacks is developed. It focuses on the activations of the neurons produced by the model when an input sample is injected. Moreover, it pays attention to the topology of the targeted deep learning model, analyzing the activations according to which neurons are connected. This approach was chosen because the literature shows that the targeted model's topology contains essential information about whether an evasion attack is occurring. For this purpose, extensive data preprocessing is required to feed all this information into the detector, which uses Graph Convolutional Neural Network (GCN) technology. The detector thus takes the topology of the target model into account, obtaining promising results and improving on the outcomes reported in the literature for similar defenses.
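
To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a tiny bias-free ReLU classifier, and it assumes that an edge joins two neurons of consecutive layers whenever the signal passed between them (source activation times weight) is nonzero for the given input; the paper's actual Behavior Graph definition and its four node attributes are not reproduced here. The names `behavior_graph_adjacency` and `gcn_layer`, the toy weights, and the use of raw activations as node features are all hypothetical. The second function applies the standard GCN propagation rule (symmetric normalization of the adjacency matrix with self-loops), which is the kind of layer a GCN-based detector stacks.

```python
import numpy as np


def behavior_graph_adjacency(weights, x):
    """Return (adjacency, activations) for a small bias-free ReLU MLP.

    Assumption (not taken from the paper): neuron i in one layer is linked
    to neuron j in the next layer when a_i * w_ij != 0 for the input x.
    """
    sizes = [weights[0].shape[0]] + [w.shape[1] for w in weights]
    offsets = np.concatenate(([0], np.cumsum(sizes)))
    n = int(offsets[-1])
    adj = np.zeros((n, n))

    a = np.asarray(x, dtype=float)
    activations = [a]
    for k, w in enumerate(weights):
        signal = a[:, None] * w  # contribution of each source neuron to each target
        rows = slice(offsets[k], offsets[k + 1])
        cols = slice(offsets[k + 1], offsets[k + 2])
        adj[rows, cols] = (signal != 0).astype(float)
        a = np.maximum(signal.sum(axis=0), 0.0)  # ReLU activation of the next layer
        activations.append(a)

    adj = np.maximum(adj, adj.T)  # treat the graph as undirected
    return adj, np.concatenate(activations)


def gcn_layer(adj, features, theta):
    """One GCN propagation step: ReLU(D^-1/2 (A + I) D^-1/2 H Theta)."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))  # degrees are >= 1, so this is safe
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ theta, 0.0)


# Toy classifier: 3 inputs -> 2 hidden neurons -> 1 output (hypothetical weights).
toy_weights = [
    np.array([[1.0, -1.0], [0.5, 2.0], [1.0, 1.0]]),
    np.array([[1.0], [1.0]]),
]
toy_input = [1.0, 0.0, 2.0]

adj, acts = behavior_graph_adjacency(toy_weights, toy_input)
node_features = acts[:, None]  # one feature per neuron: its activation
theta = np.ones((1, 4))  # hypothetical layer weights
hidden = gcn_layer(adj, node_features, theta)
```

In this toy example the second input neuron has zero activation, so it ends up with no edges, which mirrors the remark in the Figure 2 caption that unconnected neurons do not appear in the graph. A real detector would replace the raw activation feature with the attributes defined in the paper and would learn `theta` from labeled clean and adversarial samples.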

Paper Structure (22 sections, 18 equations, 13 figures, 5 tables)

Introduction
Literature Review
Methodology
Behavior Graph
Classifier Node Attributes
Impact
Influence
Input Proportion
Specialization
Data preprocessing
Evasion attack detector
Experiment
Scenario
Experimental process
Results and Discussion
...and 7 more sections

Figures (13)

Figure 1: Diagram of the processes that are followed in this work to develop the detector against the evasion attack.
Figure 2: A classifier's behavior graph example, where the central cluster corresponds to the input layer of the classifier, the surrounding nodes are the neurons of the hidden layer, and the rightmost nodes correspond to the output layer. Neurons without any connection do not appear. For example, although the input layer contains more neurons, they are not shown in the graph because their activations are zero.
Figure 3: Classifier Behavior Graph example, where the nodes are colored according to their impact attribute
Figure 4: Classifier Behavior Graph example, where the nodes are colored according to their influence attribute
Figure 5: Classifier Behavior Graph example, where the nodes are colored according to their input proportion attribute
...and 8 more figures

Theorems & Definitions (5)

Definition 1: Behavior Graph
Definition 2: Impact
Definition 3: Influence
Definition 4: Input Proportion
Definition 5: Specialization $\mathfrak{c}$
