Mitigating Low-Frequency Bias: Feature Recalibration and Frequency Attention Regularization for Adversarial Robustness

Kejia Zhang, Juanjuan Weng, Yuanzheng Cai, Zhiming Luo, Shaozi Li

TL;DR

Adversarial training (AT) often induces a low-frequency bias that underutilizes high-frequency semantic cues. The authors propose HFDR (High-Frequency Feature Disentanglement and Recalibration), a module that separates high- and low-frequency features using an SRM-based decomposition and recalibrates the high-frequency cues with a three-layer network. A frequency attention regularization (FAR) loss enforces balance across the frequency spectrum, and the module can be integrated with existing AT methods. Across CIFAR-10/100, Tiny ImageNet, and Imagenette, combining HFDR with AT consistently improves white-box and transfer robustness with minimal overhead, and ablations confirm the contribution of each component, including the frequency-aware regularization. The result is a practical plug-in approach that boosts adversarial robustness by leveraging high-frequency information, with potential implications for broader frequency-domain defenses in vision.

Abstract

Ensuring the robustness of deep neural networks against adversarial attacks remains a fundamental challenge in computer vision. While adversarial training (AT) has emerged as a promising defense strategy, our analysis reveals a critical limitation: AT-trained models exhibit a bias toward low-frequency features while neglecting high-frequency components. This bias is particularly concerning as each frequency component carries distinct and crucial information: low-frequency features encode fundamental structural patterns, while high-frequency features capture intricate details and textures. To address this limitation, we propose High-Frequency Feature Disentanglement and Recalibration (HFDR), a novel module that strategically separates and recalibrates frequency-specific features to capture latent semantic cues. We further introduce frequency attention regularization to harmonize feature extraction across the frequency spectrum and mitigate the inherent low-frequency bias of AT. Extensive experiments demonstrate our method's superior performance against white-box attacks and transfer attacks, while exhibiting strong generalization capabilities across diverse scenarios.

Paper Structure

This paper contains 29 sections, 12 equations, 9 figures, 9 tables, and 1 algorithm.

Figures (9)

  • Figure 1: Analysis of model performance across frequency components under different adversarial perturbation strengths ($\epsilon$ = 0, 4, 8, 12) for normal training (NT, left) and adversarial training (AT, right). The x-axis shows the percentage of frequency components retained after Fourier transformation.
  • Figure 2: Comparison of predicted confidence (post-softmax probability) across different frequency components for clean and adversarial (Adv) images. Results are shown for the normally trained model, the adversarially trained model, and our proposed model. The original inputs are decomposed into high- and low-frequency components via the 2D Discrete Fourier Transform (DFT).
  • Figure 3: Overall framework of our proposed method. During network inference, the input feature map is processed through an SRM filter, which decomposes it into low-frequency $\mathcal{X}_{(LF)}$ and high-frequency $\mathcal{X}_{(HF)}$ components. To avoid gradient masking, Gumbel-softmax generates element-wise high-frequency $A_{(HF)}$ and low-frequency $A_{(LF)}$ attention maps, facilitating feature disentanglement across frequency domains. The high-frequency features are recalibrated and then merged with the low-frequency features before propagation to subsequent network layers. Simultaneously, the frequency-aware attention regularization loss $L_{(FAR)}$ constrains the generated attention maps, encouraging the extraction of semantically rich high-frequency features while mitigating low-frequency bias. The model is jointly optimized with $L_{(FAR)}$ and the cross-entropy loss $L_{(CE)}$.
  • Figure 4: Comparison of robust accuracy using WRN34-10 on CIFAR-10 dataset under PGD-10 and PGD-100 attacks with varying $\epsilon$ values. The $x$-axis shows the perturbation bound $\epsilon$, and the $y$-axis shows the robust accuracy (%).
  • Figure 5: Comparison of growth rate (%) and robust accuracy (%) under PGD-10 attack ($\epsilon=8$) between HFDR-AT and PGD-AT methods across increasing frequency components.
  • ...and 4 more figures
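
The captions for Figures 1 and 2 refer to splitting an input into low- and high-frequency components with a 2D DFT. The following is a minimal NumPy sketch of one common way to do such a split, not the authors' implementation. The function name `split_frequencies`, the circular mask, and the interpretation of `keep_fraction` as a fraction of the maximum spectral radius are all assumptions made for illustration; the paper may define the retained percentage differently.

```python
import numpy as np


def split_frequencies(image, keep_fraction):
    """Split a 2D array into low- and high-frequency parts (hypothetical helper).

    keep_fraction is an assumed parameter: the radius of the low-pass mask,
    expressed as a fraction of the largest distance from the spectrum centre.
    """
    h, w = image.shape
    # Centre the spectrum so the zero-frequency (DC) term sits at (h // 2, w // 2).
    spectrum = np.fft.fftshift(np.fft.fft2(image))

    # Distance of every frequency bin from the DC term.
    yy, xx = np.ogrid[:h, :w]
    radius = np.hypot(yy - h // 2, xx - w // 2)

    # Bins inside the circle count as "low frequency"; the rest as "high frequency".
    low_mask = radius <= keep_fraction * radius.max()

    # Invert each masked spectrum; the two real parts sum back to the input.
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high


# Usage on a random single-channel "image" (a stand-in for real data).
rng = np.random.default_rng(0)
img = rng.random((32, 32))
low, high = split_frequencies(img, keep_fraction=0.25)
print(np.allclose(low + high, img))
```

Because the two masks are complementary and the DFT is linear, the low- and high-frequency parts sum back to the original input, which makes it straightforward to feed either part alone to a classifier, as in the frequency-wise evaluations the captions describe. For colour images, the same split would be applied per channel.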