Efficient Adversarial Input Generation via Neural Net Patching

Tooba Khan; Kumar Madhukar; Subodh Vishnu Sharma

TL;DR

This paper addresses DNN robustness by deriving adversarial inputs from neural network patches rather than searching the input space directly. The proposed method, Aigent, uses an iterative edge-layer patching approach and translates the resulting weight changes into input changes via linear equations, which yields small, natural perturbations and scales better than prior methods. Experiments on MNIST, CIFAR-10, and ImageNet show that Aigent achieves superior naturalness (FID scores as low as 0.001) and alters only a small fraction of pixels, while matching or exceeding prior techniques in defect detection and transferability. The results point to a promising direction for generating high-quality adversarial inputs and for using them in adversarial training.

Abstract

The generation of adversarial inputs has become a crucial issue in establishing the robustness and trustworthiness of deep neural nets, especially when they are used in safety-critical application domains such as autonomous vehicles and precision medicine. However, the problem poses multiple practical challenges, including scalability issues owing to large-sized networks, and the generation of adversarial inputs that lack important qualities such as naturalness and output-impartiality. This problem shares its end goal with the task of patching neural nets where small changes in some of the network's weights need to be discovered so that upon applying these changes, the modified net produces the desirable output for a given set of inputs. We exploit this connection by proposing to obtain an adversarial input from a patch, with the underlying observation that the effect of changing the weights can also be brought about by changing the inputs instead. Thus, this paper presents a novel way to generate input perturbations that are adversarial for a given network by using an efficient network patching technique. We note that the proposed method is significantly more effective than the prior state-of-the-art techniques.
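The central observation, that a first-layer weight change can be replayed as an input change, can be illustrated with a small linear-algebra sketch. This is not the authors' implementation: the function name `input_from_first_layer_patch`, the toy matrices, and the choice of a minimum-norm least-squares solve are all assumptions made for illustration, and the sketch ignores pixel-range limits and the naturalness constraints discussed in the paper. It assumes the first layer computes `W @ x + b` with the bias left unchanged, so the bias cancels.

```python
import numpy as np

def input_from_first_layer_patch(W, W_patched, x):
    """Hypothetical helper: find x_adv with W @ x_adv == W_patched @ x.

    If the patched first layer maps the original input x to some
    pre-activations, the unpatched layer produces the same values on
    x_adv, so every later layer sees identical inputs.
    """
    # Pre-activations the patched layer produces on the original input.
    target = W_patched @ x
    # Solve W @ delta = target - W @ x. For an underdetermined system,
    # lstsq returns the minimum-norm solution, i.e. a small perturbation.
    delta, *_ = np.linalg.lstsq(W, target - W @ x, rcond=None)
    return x + delta

# Toy first layer: 2 neurons, 3 input features (values are illustrative).
W = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, -1.0]])
# Assumed patch: a small change to two first-layer weights.
W_patched = W + np.array([[0.1, 0.0, 0.0],
                          [0.0, -0.2, 0.0]])
x = np.array([0.5, 0.2, 0.9])

x_adv = input_from_first_layer_patch(W, W_patched, x)
print("perturbation:", x_adv - x)
print("outputs match:", np.allclose(W @ x_adv, W_patched @ x))
```

An exact solution exists here because the toy `W` has full row rank; in general the system may only be solvable approximately, in which case the least-squares residual would need to be checked.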

Paper Structure (3 sections, 20 equations, 7 figures, 3 tables, 1 algorithm)

Simplifying DNN Modification Constraints
Benchmark datasets
Metrics of Evaluation

Figures (7)

Figure 1: Adversarial inputs from first-layer modification. The adversarial input and the corresponding values of each neuron are written in red. The modifications required in the first-layer weights are shown in dotted boxes.
Figure 2: Middle-layer modification and sub-net extraction
Figure 3: Example illustrating DNN modification, from modifications:1. Red indicates a decrement neuron and green indicates an increment neuron.
Figure 4: DNN modification constraints for Fig. 3
Figure 5: Examples of adversarial images generated by other techniques lacking originality.
...and 2 more figures
