Enhancing Adversarial Transferability via Information Bottleneck Constraints

Biqing Qi, Junqi Gao, Jianxing Liu, Ligang Wu, Bowen Zhou

TL;DR

The paper addresses the limited transferability of adversarial attacks in black-box settings by introducing IBTA, an information-bottleneck-based framework that reduces the dependence of adversarial perturbations on non-essential input features. It derives a simple mutual information lower bound (MILB) to approximate $I(\mathcal{E}; X \mid Y_{adv})$ and uses MINE to quantify mutual information, enabling scalable optimization. Empirical results on ImageNet show consistent transferability improvements for both non-targeted and targeted attacks when IBTA is integrated with existing methods, including in ensemble and adversarially trained scenarios. The work provides a principled approach to concentrating perturbations on invariant, class-relevant features, with practical implications for evaluating defenses and for robustness research, and the code is released for reproducibility.

Abstract

From the perspective of information bottleneck (IB) theory, we propose a novel framework for performing black-box transferable adversarial attacks named IBTA, which leverages advancements in invariant features. Intuitively, diminishing the reliance of adversarial perturbations on the original data, under equivalent attack performance constraints, encourages a greater reliance on the invariant features that contribute most to classification, thereby enhancing the transferability of adversarial attacks. Building on this motivation, we redefine the optimization of transferable attacks using a novel theoretical framework centered on IB. Specifically, to overcome the challenge of unoptimizable mutual information, we propose a simple and efficient mutual information lower bound (MILB) for approximate computation. Moreover, to quantitatively evaluate mutual information, we utilize the Mutual Information Neural Estimator (MINE) to perform a thorough analysis. Our experiments on the ImageNet dataset demonstrate the efficiency and scalability of IBTA and the derived MILB. Our code is available at https://github.com/Biqing-Qi/Enhancing-Adversarial-Transferability-via-Information-Bottleneck-Constraints.
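
The abstract relies on MINE to quantify mutual information. As a rough illustration of the quantity MINE maximizes, the sketch below evaluates the Donsker-Varadhan bound $I(X;Z) \ge \mathbb{E}_{P_{XZ}}[T] - \log \mathbb{E}_{P_X \otimes P_Z}[e^{T}]$ on a toy pair of correlated Gaussians. It is not the authors' implementation: MINE trains a neural network as the critic $T$, whereas this example assumes a fixed closed-form critic (the log density ratio of the toy distribution), and all function names here are hypothetical.

```python
import math
import random


def dv_lower_bound(critic, joint_pairs, product_pairs):
    """Donsker-Varadhan estimate: E_joint[T] - log E_product[exp(T)]."""
    joint_term = sum(critic(x, z) for x, z in joint_pairs) / len(joint_pairs)
    scores = [critic(x, z) for x, z in product_pairs]
    peak = max(scores)  # shift by the max so the exponentials cannot overflow
    mean_exp = sum(math.exp(s - peak) for s in scores) / len(scores)
    return joint_term - (peak + math.log(mean_exp))


def sample_correlated_pairs(n, rho, rng):
    """Draw n pairs (x, z) of standard Gaussians with correlation rho."""
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        z = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((x, z))
    return pairs


def gaussian_log_density_ratio(rho):
    """Assumed critic for the toy data: log p(x, z) / (p(x) p(z))."""
    def critic(x, z):
        quad = (rho * rho * (x * x + z * z) - 2.0 * rho * x * z) / (
            2.0 * (1.0 - rho * rho)
        )
        return -0.5 * math.log(1.0 - rho * rho) - quad
    return critic


rng = random.Random(0)
rho = 0.6
joint = sample_correlated_pairs(4000, rho, rng)

# Shuffling z breaks the pairing and approximates the product of marginals.
shuffled_z = [z for _, z in joint]
rng.shuffle(shuffled_z)
product = [(x, z) for (x, _), z in zip(joint, shuffled_z)]

estimate = dv_lower_bound(gaussian_log_density_ratio(rho), joint, product)
closed_form = -0.5 * math.log(1.0 - rho * rho)  # exact MI of the toy pair
print(f"DV estimate: {estimate:.3f}, closed form: {closed_form:.3f}")
```

With the density-ratio critic the bound is tight in expectation, so the estimate should land near the closed-form value up to sampling noise; a constant critic, by contrast, yields a bound of zero. In MINE the critic is learned by gradient ascent on this same objective.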

Paper Structure

This paper contains 9 sections, 1 theorem, 9 equations, 3 figures, 3 tables, 1 algorithm.

Key Result

Theorem 1

(Theorem 3 in Belghazi et al., 2018) Given any values $\xi$, $\delta$ of the desired accuracy and confidence parameters, we have whenever the number $n$ of samples satisfies

Figures (3)

  • Figure 1: The values of computed $I_\varphi(\mathcal{E};X|Y_{adv})$ under targeted and non-targeted settings, with "+IBTA" indicating the addition of IBTA during the iteration process.
  • Figure 2: Transfer success rates of MIM+IBTA with different parameter settings in the untargeted scenario.
  • Figure 3: Transfer success rates of MIM+IBTA with different parameter settings in the targeted scenario.

Theorems & Definitions (1)

  • Theorem 1