Mitigating Communication Costs in Neural Networks: The Role of Dendritic Nonlinearity
Xundong Wu, Pengfei Zhao, Zilin Yu, Lei Ma, Ka-Wa Yip, Huajin Tang, Gang Pan, Poirazi Panayiota, Tiejun Huang
TL;DR
The paper investigates whether nonlinear dendritic processing can reduce communication costs in artificial neural networks without substantially harming learning capacity. It introduces a dendritic neuron model with $K$ branches and compares it to point neurons across dense and sparse regimes, maintaining comparable compute by setting $\hat{D}=D/\sqrt{K}$ and using budget ratio $\Psi$. The key finding is that dendritic nonlinearities provide limited gains in learning capacity but substantially lower inter-neuronal communication and memory access, with empirical scaling $\hat{C}_E \propto K^{-0.51}$ under fixed budgets. These results inform the design of energy-efficient neural accelerators and memory systems for training and inference.
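One way to see why the setting $\hat{D}=D/\sqrt{K}$ keeps budgets comparable is to count weights. The sketch below rests on an assumption not stated in the summary: $D$ is read as the width of a fully connected layer of point neurons, and each of the $\hat{D}$ dendritic neurons is taken to have $K$ branches that each read all $\hat{D}$ inputs. The function names are hypothetical and serve only as illustration. Under that reading the weight count is $K\hat{D}^2 = D^2$, while the number of activations passed between layers falls as $D/\sqrt{K}$, which is close to the reported empirical exponent of $-0.51$.

```python
import math


def point_layer_weights(d: int) -> int:
    """Weights in a fully connected layer of d point neurons reading d inputs."""
    return d * d


def dendritic_layer_weights(d_hat: int, k: int) -> int:
    """Weights when each of d_hat neurons has k branches, each reading d_hat inputs.

    This connectivity pattern is an assumption made for illustration.
    """
    return d_hat * k * d_hat


def matched_width(d: int, k: int) -> int:
    """Layer width D / sqrt(K), rounded to the nearest integer."""
    return round(d / math.sqrt(k))


D = 1024  # illustrative point-neuron layer width, not a value from the paper
for K in (1, 4, 16, 64):
    d_hat = matched_width(D, K)
    # Columns: K, matched width, dendritic weight count, point-neuron weight count
    print(K, d_hat, dendritic_layer_weights(d_hat, K), point_layer_weights(D))
```

For these values of $K$ the two weight counts coincide exactly, while the matched width, and with it the number of values each layer must transmit, shrinks by a factor of $\sqrt{K}$.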
Abstract
Our understanding of biological neuronal networks has profoundly influenced the development of artificial neural networks (ANNs). However, neurons utilized in ANNs differ considerably from their biological counterparts, primarily due to the absence of complex dendritic trees with local nonlinearities. Early studies have suggested that dendritic nonlinearities could substantially improve the learning capabilities of neural network models. In this study, we systematically examined the role of nonlinear dendrites within neural networks. Utilizing machine-learning methodologies, we assessed how dendritic nonlinearities influence neural network performance. Our findings demonstrate that dendritic nonlinearities do not substantially affect learning capacity; rather, their primary benefit lies in enabling network capacity expansion while minimizing communication costs through effective localized feature aggregation. This research provides critical insights with significant implications for designing future neural network accelerators aimed at reducing communication overhead during neural network training and inference.
