PushPull-Net: Inhibition-driven ResNet robust to image corruptions

Guru Swaroop Bennabhaktula, Enrique Alegre, Nicola Strisciuglio, George Azzopardi

Abstract

We introduce a novel computational unit, termed PushPull-Conv, in the first layer of a ResNet architecture, inspired by the anti-phase inhibition phenomenon observed in the primary visual cortex. This unit redefines the traditional convolutional layer by implementing a pair of complementary filters: a trainable push kernel and its counterpart, the pull kernel. The push kernel (analogous to traditional convolution) learns to respond to specific stimuli, while the pull kernel reacts to the same stimuli but of opposite contrast. This configuration enhances stimulus selectivity and effectively inhibits the response in regions lacking preferred stimuli. This effect arises because the push and pull kernels produce responses of comparable magnitude in such regions and thereby neutralize each other. The incorporation of the PushPull-Conv into ResNets significantly increases their robustness to image corruption. Our experiments with benchmark corruption datasets show that the PushPull-Conv can be combined with other data augmentation techniques to further improve model robustness. We set a new robustness benchmark on ResNet50, achieving an $mCE$ of 49.95$\%$ on ImageNet-C when combining PRIME augmentation with PushPull inhibition.
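
The push-pull idea described in the abstract can be illustrated with a minimal single-channel NumPy sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not confirm: the pull kernel is taken to be the negated push kernel, the rectified pull response is smoothed with a small stride-1 average pool, and an inhibition weight `alpha` scales the pull term. The function names (`correlate2d_valid`, `avg_pool_same`, `push_pull_response`) and the Sobel-like kernel are hypothetical and chosen only for this example.

```python
import numpy as np


def correlate2d_valid(image, kernel):
    """'Valid' 2-D cross-correlation (the operation deep-learning layers call convolution)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x):
    """Half-wave rectification."""
    return np.maximum(x, 0.0)


def avg_pool_same(x, size=3):
    """Stride-1 average pooling with edge padding; `size` is assumed to be odd."""
    pad = size // 2
    padded = np.pad(x, pad, mode="edge")
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = padded[i:i + size, j:j + size].mean()
    return out


def push_pull_response(image, push_kernel, alpha=1.0, pool_size=3):
    """Rectified push response minus a weighted, smoothed, rectified pull response."""
    # Assumption: the pull kernel is the push kernel with opposite contrast.
    pull_kernel = -push_kernel
    push = relu(correlate2d_valid(image, push_kernel))
    pull = relu(correlate2d_valid(image, pull_kernel))
    # Assumption: the pull response is locally averaged before inhibition.
    return push - alpha * avg_pool_same(pull, pool_size)


# Hypothetical example: a vertical step edge and a Sobel-like push kernel.
image = np.zeros((8, 8))
image[:, 4:] = 1.0
sobel_x = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])

preferred = push_pull_response(image, sobel_x)        # edge with the preferred contrast
reversed_ = push_pull_response(1.0 - image, sobel_x)  # same edge, opposite contrast
```

In this sketch, an edge of the preferred contrast passes through unchanged because the pull response is zero there, while the contrast-reversed edge yields only non-positive (inhibited) output. Note that with `pool_size=1` and `alpha=1` the expression reduces to the plain linear filter response, so the spatial pooling is what makes the inhibition non-trivial here.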

Paper Structure (16 sections, 10 equations, 8 figures, 4 tables)

This paper contains 16 sections, 10 equations, 8 figures, 4 tables.

Figures (8)

  • Figure 1: The proposed approach involves substituting the first convolutional layer with PushPull-Conv, enhancing robustness with minimal computational overhead.
  • Figure 2: The proposed push-pull computation unit. Refer to \ref{eq: pupu_unit}.
  • Figure 3: (Top row) Five randomly selected filters from the conv1 layer of ResNet50 (trained on ImageNet). For a visual illustration, only the first filter channel is depicted. (Bottom row) The corresponding pull kernels as determined by \ref{eq: scaled_pull_kernel_simplified}.
  • Figure 4: Illustration of push-pull filtering with a simulated input corrupted by Gaussian noise. The push kernel is learned by ResNet50. The SNR is computed as $20\log_{10}(A_{s}/A_n)$, where $A_s$ is the average response across a 5-pixel vertical edge, and $A_n$ is the average response in the background.
  • Figure 5: Fourier spectral analysis of push-pull filters in ResNet50. The averaged push spectrum of the 64 push kernels is presented in the top right, while the corresponding pull spectrum is shown at the bottom right. These push and pull spectra are then combined with various choices of the hyperparameters $\alpha$ and AvgPool, and the resulting push-pull spectra are displayed on the left. For visual illustration, only the first filter channel is depicted.
  • ...and 3 more figures
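
The SNR definition quoted in the Figure 4 caption, $20\log_{10}(A_s/A_n)$, can be made concrete with a short sketch. The function names and the boolean `edge_mask` argument are hypothetical, and the sketch assumes non-negative (for example, rectified) responses so that both averages are positive; how the authors select the edge and background pixels is not specified here.

```python
import numpy as np


def snr_db(a_s, a_n):
    """SNR in decibels from an average signal response a_s and an average background response a_n."""
    if a_s <= 0 or a_n <= 0:
        raise ValueError("both average responses must be positive")
    return 20.0 * np.log10(a_s / a_n)


def snr_from_response(response, edge_mask):
    """Hypothetical helper: edge_mask is True on the edge pixels and False on the background."""
    a_s = response[edge_mask].mean()   # average response on the edge
    a_n = response[~edge_mask].mean()  # average response in the background
    return snr_db(a_s, a_n)
```

Under this definition, an edge response ten times larger than the background response corresponds to 20 dB, and equal responses correspond to 0 dB.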