Deep Rewiring: Training very sparse deep networks
Guillaume Bellec, David Kappel, Wolfgang Maass, Robert Legenstein
TL;DR
Deep Rewiring (DEEP R) addresses training deep networks under strict connectivity limits by jointly learning weights and sparse architectures via stochastic rewiring. It frames rewiring as sampling from a tempered posterior over both parameters and connectivity, enforcing a hard bound on the number of active connections and enabling online adaptation to task demands. Empirically, DEEP R achieves competitive performance at very high sparsity across feedforward, convolutional, and recurrent models, often outperforming pruning-based methods at similar sparsities and enabling transfer-like learning. Theoretical results establish convergence to a stationary constrained posterior, providing a rigorous foundation for sparse online learning with potential hardware benefits. Overall, DEEP R offers a principled, brain-inspired approach to efficient, on-chip training and deployment of sparse deep networks.
Abstract
Neuromorphic hardware tends to limit the connectivity of the deep networks that can run on it. Generic hardware and software implementations of deep learning also run more efficiently on sparse networks. Several methods exist for pruning the connections of a neural network after it has been trained without connectivity constraints. We present an algorithm, DEEP R, that enables us to directly train a sparsely connected neural network. DEEP R automatically rewires the network during supervised training so that connections are placed where they are most needed for the task, while the total number of connections remains strictly bounded at all times. We demonstrate that DEEP R can be used to train very sparse feedforward and recurrent neural networks on standard benchmark tasks with only a minor loss in performance. DEEP R is based on a rigorous theoretical foundation that views rewiring as stochastic sampling of network configurations from a posterior.
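The abstract does not spell out the update rule, so the following is only a minimal sketch of how rewiring under a hard connection bound might look. It assumes that each connection has a non-negative parameter `theta` and a fixed sign, that active connections take a noisy gradient step with L1 shrinkage, that a connection becomes dormant when its parameter drops below zero, and that dormant connections are reactivated uniformly at random until exactly `K` are active. The function names and the hyperparameters `lr`, `l1`, and `temperature` are hypothetical and chosen for illustration.

```python
import numpy as np


def rewiring_step(theta, active, grad, K, lr=0.05, l1=1e-4,
                  temperature=1e-5, rng=None):
    """One illustrative rewiring step on a flat vector of connections.

    theta  : non-negative connection parameters (zero where dormant)
    active : boolean mask marking the currently active connections
    grad   : gradient of the loss with respect to theta
    K      : hard bound on the number of active connections (assumed K <= len(theta))
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = theta.copy()
    active = active.copy()

    # Noisy gradient step with L1 shrinkage, applied to active connections only.
    noise = rng.standard_normal(theta.shape)
    step = -lr * (grad + l1) + np.sqrt(2.0 * lr * temperature) * noise
    theta[active] += step[active]

    # An active connection whose parameter fell below zero becomes dormant.
    died = active & (theta < 0.0)
    active[died] = False
    theta[died] = 0.0

    # Reactivate randomly chosen dormant connections until K are active again.
    n_missing = K - int(active.sum())
    if n_missing > 0:
        dormant = np.flatnonzero(~active)
        revived = rng.choice(dormant, size=n_missing, replace=False)
        active[revived] = True
        theta[revived] = 0.0
    return theta, active


def effective_weights(theta, sign, active):
    """Dormant connections contribute a weight of exactly zero."""
    return np.where(active, sign * theta, 0.0)
```

In this sketch the number of active connections equals `K` after every step, which mirrors the strict bound described above. The noise term is what makes the procedure a sampling process rather than plain gradient descent; setting `temperature` to zero would reduce it to deterministic rewiring driven only by the gradient and the shrinkage term.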
