Dynamics-aware Adversarial Attack of Adaptive Neural Networks

An Tao, Yueqi Duan, Yingqi Wang, Jiwen Lu, Jie Zhou

TL;DR

This paper reformulates the gradients to be aware of potential dynamic changes in the network architecture, so that the learned attack "leads" the next step better than dynamics-unaware methods when the architecture changes during the attack.

Abstract

In this paper, we investigate the dynamics-aware adversarial attack problem of adaptive neural networks. Most existing adversarial attack algorithms are designed under a basic assumption: the network architecture is fixed throughout the attack process. However, this assumption does not hold for many recently proposed adaptive neural networks, which adaptively deactivate unnecessary execution units based on inputs to improve computational efficiency. This results in a serious issue of lagged gradient: the attack learned at the current step becomes ineffective because the architecture changes afterward. To address this issue, we propose a Leaded Gradient Method (LGM) and show the significant effects of the lagged gradient. More specifically, we reformulate the gradients to be aware of the potential dynamic changes of network architectures, so that the learned attack "leads" the next step better than dynamics-unaware methods when the network architecture changes dynamically. Extensive experiments on representative types of adaptive neural networks for both 2D images and 3D point clouds show that our LGM achieves impressive adversarial attack performance compared with the dynamics-unaware attack methods. Code is available at https://github.com/antao97/LGM.
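
The lagged-gradient issue described above can be illustrated with a toy example. The sketch below is not the paper's LGM formulation. It assumes a hypothetical two-input network with a single gated execution unit, and it assumes a sigmoid relaxation of the binary gate as one possible way to make the gradient aware of an upcoming architecture change. All names and constants (`THRESH`, `TAU`, `PENALTY`, `W`) are made up for illustration.

```python
import math

# All constants below are hypothetical values chosen for illustration.
THRESH = 1.0       # the gated unit executes only when x[0] > THRESH
TAU = 0.1          # temperature of the assumed sigmoid relaxation
PENALTY = 3.0      # amount the gated unit adds to the ground-truth logit
W = (-1.0, -2.0)   # weights of the always-executed linear path


def hard_gate(x1):
    """Binary occupancy used in the real forward pass."""
    return 1.0 if x1 > THRESH else 0.0


def soft_gate(x1):
    """Assumed smooth stand-in for the gate, used only to compute gradients."""
    return 1.0 / (1.0 + math.exp(-(x1 - THRESH) / TAU))


def logit(x):
    """Ground-truth logit of the toy adaptive network (hard gate)."""
    return W[0] * x[0] + W[1] * x[1] + PENALTY * hard_gate(x[0])


def grad_unaware(x):
    """Gradient with the architecture frozen: the gate is a constant."""
    return (W[0], W[1])


def grad_aware(x):
    """Gradient that also differentiates through the soft gate."""
    s = soft_gate(x[0])
    dgate = s * (1.0 - s) / TAU
    return (W[0] + PENALTY * dgate, W[1])


def sign(v):
    return (v > 0) - (v < 0)


def attack_step(x, grad, eps=0.2):
    """One signed-gradient step that tries to decrease the logit."""
    return tuple(xi - eps * sign(gi) for xi, gi in zip(x, grad))


x0 = (0.9, 0.0)                              # gated unit is inactive here
x_unaware = attack_step(x0, grad_unaware(x0))
x_aware = attack_step(x0, grad_aware(x0))

print("clean logit:           ", logit(x0))
print("dynamics-unaware step: ", logit(x_unaware))
print("dynamics-aware step:   ", logit(x_aware))
```

In this toy setting the frozen-architecture gradient pushes `x[0]` across the threshold, the gated unit switches on, and the logit rises instead of falling. The gradient taken through the soft gate anticipates that switch and steps away from it, so the logit decreases.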

Paper Structure (29 sections, 33 equations, 10 figures, 11 tables)

This paper contains 29 sections, 33 equations, 10 figures, 11 tables.

Figures (10)

  • Figure 1: An illustration of the benefit of dynamics-aware attack on a 3D sparse convolution network for point clouds. If a voxel no longer contains any points and disappears after one attack step, the convolution on this voxel becomes invalid. Because of this change in network architecture, the perturbation learned by the dynamics-unaware attack may not be effective against the changed architecture. In contrast, our dynamics-aware attack considers how the positions of the convolution kernels change after the attack and achieves remarkably lower mIoU on the presented point cloud scene.
  • Figure 2: An illustration of the ineffective attack in both the (a) continuous and (b) discontinuous mapping between the network's input $x$ and output logit $F_{\rm gt}(x)$ on the ground-truth class. In this paper, we summarize the gradient issue in (b) as the lagged gradient caused by the network architecture change in adaptive neural networks. We propose a dynamics-aware attack that reconstructs the mapping to reformulate the back-propagated gradient as the leaded gradient in (c), which is aware of the potential network architecture change. Guided by the leaded gradient in (c), we obtain $x_{\rm adv}'$ in (b), which satisfies $F_{\rm gt}(x_{\rm adv}')<F_{\rm gt}(x_0)$. In this figure, we treat the input $x$ as a one-dimensional variable for simplicity and try to decrease the output logit $F_{\rm gt}(x)$ through the attack.
  • Figure 3: An illustration of the conventional static network that feeds the data through all layers, the layer skipping network in (\ref{skipnet}) that adopts the module in (\ref{skip}), the generalized layer skipping network that adopts the module in (\ref{skip_gen}), and our dynamics-aware layer skipping network that adopts the module in (\ref{skip_gen2}) with soft occupancy. For simplicity, the figure shows a network that contains only three skippable layers.
  • Figure 4: An illustration of conventional 2D convolution that operates on every pixel, 2D sparse convolution in (\ref{2dconv}), generalized 2D sparse convolution in (\ref{2dconv2}), and our dynamics-aware 2D sparse convolution in (\ref{2dconv3}) with soft occupancy. For conciseness, we draw only the convolution centers in this figure, and the color shade indicates the magnitude of the convolution's occupancy value. For simplicity, we draw the pixel-wise grid size as $6\times 6$.
  • Figure 5: An illustration of conventional 3D convolution that operates on every voxel, 3D sparse convolution in (\ref{conv}), generalized 3D sparse convolution in (\ref{conv3}), and our dynamics-aware 3D sparse convolution in (\ref{conv2}) with soft occupancy in (\ref{o_hat}). For conciseness, we draw only the convolution centers in this figure, and the color shade indicates the magnitude of the convolution's occupancy value.
  • ...and 5 more figures