Neural Echos: Depthwise Convolutional Filters Replicate Biological Receptive Fields
Zahra Babaiee, Peyman M. Kiasari, Daniela Rus, Radu Grosu
TL;DR
The paper investigates whether depthwise convolutional filters in CNNs naturally replicate biological center-surround receptive fields and whether this structure can be leveraged to improve learning. It analyzes trained kernels across multiple state-of-the-art models, formalizes a center-surround model via a Difference-of-Gaussians ($DoG$), and introduces a $DoG$-based initialization for depthwise layers that alternates excitatory and inhibitory centers and varies the center-to-surround ratio $\gamma$. The key contributions are empirical evidence of $DoG$-like center-surround motifs in depthwise kernels, a practical initialization method that improves ImageNet accuracy (notably for larger kernels), and ablation results showing the importance of $\gamma$ and of the center type. The work indicates that bio-inspired priors can enhance CNN performance and offers a pathway for integrating neuroscience insights into architecture design for vision tasks.
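To make the initialization idea concrete, the following is a minimal NumPy sketch of a $DoG$-style initializer for a depthwise layer. It is an illustration under stated assumptions, not the authors' exact procedure: the summary above does not give the precise parameterization, so the choice $\sigma_{center} = \gamma \, \sigma_{surround}$, the default `sigma_surround = size / 4`, the example `gammas` values, and the function names `dog_kernel` and `dog_init` are all hypothetical. The sketch alternates excitatory (on-center) and inhibitory (off-center) kernels across channels and cycles through several $\gamma$ values.

```python
import numpy as np


def dog_kernel(size, gamma, sigma_surround, on_center=True):
    """Build one size x size Difference-of-Gaussians kernel.

    Assumption (not from the paper summary): the center Gaussian has
    standard deviation gamma * sigma_surround, with 0 < gamma < 1.
    """
    half = (size - 1) / 2.0
    coords = np.arange(size) - half
    yy, xx = np.meshgrid(coords, coords, indexing="ij")
    r2 = xx ** 2 + yy ** 2  # squared distance from the kernel center

    sigma_center = gamma * sigma_surround
    center = np.exp(-r2 / (2 * sigma_center ** 2)) / (2 * np.pi * sigma_center ** 2)
    surround = np.exp(-r2 / (2 * sigma_surround ** 2)) / (2 * np.pi * sigma_surround ** 2)

    kernel = center - surround  # positive center, negative surround
    return kernel if on_center else -kernel


def dog_init(channels, size, gammas=(0.3, 0.5, 0.7), sigma_surround=None):
    """Return weights shaped (channels, 1, size, size) for a depthwise conv.

    Even channels get excitatory centers, odd channels inhibitory centers;
    each on/off pair shares a gamma, and gammas cycle across pairs.
    """
    if sigma_surround is None:
        sigma_surround = size / 4.0  # hypothetical default, chosen for illustration

    weights = np.empty((channels, 1, size, size), dtype=np.float32)
    for c in range(channels):
        gamma = gammas[(c // 2) % len(gammas)]
        weights[c, 0] = dog_kernel(size, gamma, sigma_surround, on_center=(c % 2 == 0))
    return weights


weights = dog_init(channels=8, size=7)
```

The resulting array uses the `(out_channels, 1, kH, kW)` layout that depthwise convolutions conventionally expect, so it could be copied into such a layer's weight tensor before training. Whether kernels should additionally be normalized or rescaled is left open here, since the summary does not say.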
Abstract
In this study, we present evidence suggesting that depthwise convolutional kernels effectively replicate the structural intricacies of the biological receptive fields observed in the mammalian retina. We provide analyses of trained kernels from various state-of-the-art models that substantiate this evidence. Inspired by this discovery, we propose an initialization scheme modeled on biological receptive fields. Experiments on the ImageNet dataset with multiple CNN architectures featuring depthwise convolutions reveal a marked improvement in the accuracy of the learned model when it is initialized with biologically derived weights. This underscores the potential of biologically inspired computational models to further our understanding of vision processing systems and to improve the efficacy of convolutional networks.
