Pixel-Based Similarities as an Alternative to Neural Data for Improving Convolutional Neural Network Adversarial Robustness

Elie Attias; Cengiz Pehlevan; Dina Obeid

TL;DR

The paper tackles adversarial robustness in CNNs by revisiting a brain-inspired regularizer and proposing a practical, neural-data-free variant based on pixel similarities. It preserves the core regularization objective and demonstrates robustness gains against several black-box attacks and common corruptions on grayscale and color CIFAR datasets, with low computational overhead. The key insight is that pixel-based similarities can substitute for neural target structures, yielding performance comparable to neural-data regularization while avoiding the burden of collecting neural recordings. Although it does not beat state-of-the-art specialized defenses, the work shows that brain-inspired principles can be exploited in simple, scalable ways, potentially enabling future hybrids that bring model robustness closer to human-level performance without complex pipelines.

Abstract

Convolutional Neural Networks (CNNs) excel in many visual tasks but remain susceptible to adversarial attacks: imperceptible perturbations that degrade performance. Prior research reveals that brain-inspired regularizers, derived from neural recordings, can bolster CNN robustness; however, reliance on specialized data limits practical adoption. We revisit a regularizer proposed by Li et al. (2019) that aligns CNN representations with neural representational similarity structures and introduce a data-driven variant. Instead of a neural recording-based similarity, our method computes a pixel-based similarity directly from images. This substitution retains the original biologically motivated loss formulation, preserving its robustness benefits while removing the need for neural measurements or task-specific augmentations. Notably, this data-driven variant provides the same robustness improvements observed with neural data. Our approach is lightweight and integrates easily into standard pipelines. Although we do not surpass cutting-edge specialized defenses, we show that neural representational insights can be leveraged without direct recordings. This underscores the promise of robust yet simple methods rooted in brain-inspired principles, even without specialized data, and raises the possibility that further integrating these insights could push performance closer to human levels without resorting to complex, specialized pipelines.

Paper Structure (18 sections, 7 equations, 24 figures, 3 tables)

Introduction
Related work
Motivation and Method
Datasets and Experiments
Evaluating robustness
Frequency decomposition of adversarial perturbations and common corruptions
Computational advantage
Hyperparameter selection
Conclusion and Discussion
Experimental setup
Robustness to random noise for models trained to classify CIFAR-10 regularized using $S^{\text{pixel}}$
Robustness to random noise for models trained to classify CIFAR-10 regularized using $S^{Th}$
Robustness on image classification task for different classification-regularization datasets
Hyperparameters used for regularization
Weighting candidate layers
...and 3 more sections

Figures (24)

Figure 1: Relationship between image pixel similarity and V1-based representational similarity. (a) Correlation between pixel similarity ($S^{pixel}$) and V1-based representational similarity ($S^{\text{neural-pred}}$) computed from the predictive model of Li et al. (2019) (Pearson’s $r = 0.83$). Pixel similarity is measured as the cosine similarity between flattened, mean-subtracted, and normalized pixel vectors. (b) Pixel similarity matrix. (c) V1-based representational similarity matrix, obtained by averaging similarities across predictive models trained on six distinct neural scans. The two matrices share a similar global structure, though the V1-based matrix spans a wider dynamic range. Additional details are provided in the supplementary material.
Figure 2: Robustness of a ResNet18 trained to classify grayscale CIFAR-10 and regularized with images from the grayscale ImageNet dataset. Robustness to (a) Gaussian noise, (b) transfer-based FGSM (Goodfellow et al., 2014), and (c) the decision-based Boundary Attack (Brendel et al., 2017). For comparison, results are shown for different regularization targets in $L_{sim}$: $S^{pixel}$, $S^{Th}$, and $S^{neural-pred}$, the neural-based targets used in Li et al. (2019) (see the Motivation and Method section). For the decision-based Boundary Attack, we compute the median squared $L_2$ perturbation size per pixel, averaged across 1000 images and 5 repeats. Error shades represent the SEM across seven seeds per model.
Figure 3: Robustness of a ResNet18 trained to classify grayscale CIFAR-10 and regularized on grayscale images from different datasets: CIFAR-10 (blue), CIFAR-100 (purple), or ImageNet (red). For the decision-based Boundary Attack, we compute the median $L_2$ perturbation size, averaged across 1000 images and 5 repeats. Error shades/bars represent the SEM across seven seeds per model. The same ($\alpha$, $Th$) values are used to train all models, i.e., for all regularization datasets (see the supplementary material).
Figure 4: Robustness of a ResNet18 trained to classify color CIFAR-10 and regularized on color images from different datasets: CIFAR-10 (blue), CIFAR-100 (purple), or ImageNet (red). For the decision-based Boundary Attack, we compute the median $L_2$ perturbation size, averaged across 1000 images and 5 repeats. Error shades/bars represent the SEM across seven seeds per model.
Figure 5: Robustness to grayscale CIFAR-10-C corruptions (Hendrycks and Dietterich, 2019). Results for regularized and unregularized models. (a) Accuracy across severity levels, averaged over all 15 corruption types. (b) Performance on the 15 individual corruptions at severity 4. Error bars show the SEM over seven seeds. Models are ResNet18 networks trained on grayscale CIFAR-10 and regularized with grayscale ImageNet.
...and 19 more figures
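
The Figure 1 caption describes pixel similarity as the cosine similarity between flattened, mean-subtracted, and normalized pixel vectors. A minimal NumPy sketch of that description follows. The function name `pixel_similarity`, the `eps` guard, and the choice to subtract each image's own mean (rather than a dataset-wide mean) are assumptions made for illustration; the paper's supplementary material may specify these details differently.

```python
import numpy as np


def pixel_similarity(images: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Pairwise cosine similarity of mean-subtracted, flattened images.

    images: array of shape (n, ...), e.g. (n, H, W) or (n, H, W, C).
    Returns an (n, n) matrix with entries in [-1, 1].
    """
    # Flatten every image into a single pixel vector.
    x = images.reshape(images.shape[0], -1).astype(np.float64)
    # Assumption: subtract each image's own mean pixel value.
    x = x - x.mean(axis=1, keepdims=True)
    # Normalize to unit length; eps avoids division by zero for constant images.
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    x = x / np.maximum(norms, eps)
    # The dot product of unit vectors is their cosine similarity.
    return x @ x.T


# Usage on a small random batch standing in for grayscale 32x32 images.
rng = np.random.default_rng(0)
batch = rng.random((4, 32, 32))
S = pixel_similarity(batch)

# An image and a brightness/contrast-shifted copy should be maximally similar.
pair = np.stack([batch[0], 2.0 * batch[0] + 3.0])
S_pair = pixel_similarity(pair)
```

Under these assumptions the measure is invariant to per-image brightness offsets and positive contrast scaling, which is why the shifted copy in `pair` has similarity close to 1 with the original.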
