Neural Echos: Depthwise Convolutional Filters Replicate Biological Receptive Fields

Zahra Babaiee, Peyman M. Kiasari, Daniela Rus, Radu Grosu

TL;DR

The paper investigates whether depthwise convolutional filters in CNNs naturally replicate biological center-surround receptive fields, and whether this structure can be leveraged to improve learning. It analyzes trained kernels across multiple state-of-the-art models, formalizes a center-surround model via the Difference-of-Gaussians (DoG), and introduces a DoG-based initialization for depthwise layers that alternates excitatory and inhibitory centers with varied center-to-surround ratios ($\gamma$). The key contributions are empirical evidence of DoG-like center-surround motifs in depthwise kernels, a practical initialization method that improves ImageNet accuracy (notably for larger kernels), and ablation results showing the importance of $\gamma$ and of the center type. The work indicates that bio-inspired priors can enhance CNN performance and offers a pathway to integrating neuroscience insights into architecture design for vision tasks.

Abstract

In this study, we present evidence suggesting that depthwise convolutional kernels effectively replicate the structural intricacies of the biological receptive fields observed in the mammalian retina. We provide analytics of trained kernels from various state-of-the-art models substantiating this evidence. Inspired by this discovery, we propose an initialization scheme that draws on the biological receptive fields. Experimental analysis on the ImageNet dataset with multiple CNN architectures featuring depthwise convolutions reveals a marked enhancement in the accuracy of the learned model when initialized with biologically derived weights. This underscores the potential for biologically inspired computational models to further our understanding of vision processing systems and to improve the efficacy of convolutional networks.

Paper Structure

This paper contains 12 sections, 3 equations, 11 figures, and 2 tables.

Figures (11)

  • Figure 1: a) Depthwise convolutional kernels trained on the ImageNet dataset, and b) the DoG model of the biological center-surround receptive fields with different center-to-surround ratios, with excitatory center (right) and inhibitory center (left). Artificial kernels mimic biological center-surround patterns.
  • Figure 2: A "difference-of-Gaussians" is used to model a neuron's sensitivity to light at various positions on the retina. For neurons with an excitatory center, this model comprises two Gaussian functions: a narrow, positive one representing the stimulatory center, and a wide, negative one representing the suppressive surround. For neurons with an inhibitory center, the signs are reversed.
  • Figure 3: Size 9 DoG kernels with inhibitory (top) and excitatory (bottom) centers, with different ratios of the center-surround radii ($\gamma$).
  • Figure 4: Random samples from depthwise convolutions of various models with different kernel sizes, trained on the ImageNet dataset. Trained kernels show considerable repeating patterns, many of them featuring a center-surround structure.
  • Figure 5: Kernels randomly selected from each K-Means cluster (top) and their respective cluster averages (bottom). Right clusters resemble excitatory-centered fields and middle clusters resemble inhibitory-centered fields. Clusters on the left contain all other patterns, so their average is cluttered, implying the dominance of the first two cluster patterns.
  • ...and 6 more figures
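
The DoG kernels described in the captions of Figures 2 and 3, and the alternating initialization mentioned in the TL;DR, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `dog_kernel` and `init_depthwise_bank`, the default surround width, the example $\gamma$ values, and the reading of $\gamma$ as the ratio of center to surround standard deviations are all assumptions made for illustration, since the paper's exact equations are not reproduced on this page.

```python
import math


def dog_kernel(size, gamma, sigma_surround=None, excitatory=True):
    """Return a size x size Difference-of-Gaussians kernel as nested lists.

    Assumption: gamma = sigma_center / sigma_surround, with 0 < gamma < 1.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a single center")
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    if sigma_surround is None:
        sigma_surround = size / 4.0  # hypothetical default, not from the paper
    sigma_center = gamma * sigma_surround
    half = size // 2

    def gauss(r2, sigma):
        # Normalized isotropic 2D Gaussian evaluated at squared radius r2.
        return math.exp(-r2 / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)

    # Excitatory center: narrow positive Gaussian minus wide Gaussian.
    # Inhibitory center: the same profile with the sign flipped.
    sign = 1.0 if excitatory else -1.0
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            r2 = x * x + y * y
            row.append(sign * (gauss(r2, sigma_center) - gauss(r2, sigma_surround)))
        kernel.append(row)
    return kernel


def init_depthwise_bank(channels, size, gammas=(0.3, 0.5, 0.7)):
    """One kernel per channel, alternating center type and cycling gamma.

    The gamma values are placeholders, not the ratios used in the paper.
    """
    return [
        dog_kernel(size, gammas[(c // 2) % len(gammas)], excitatory=(c % 2 == 0))
        for c in range(channels)
    ]
```

In a framework such as PyTorch, a bank like this would typically be copied into the weight tensor of a depthwise convolution before training; how the paper scales or normalizes the kernels is not stated here, so that step is left out.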