Towards Biologically Plausible Deep Learning
Yoshua Bengio, Dong-Hyun Lee, Jorg Bornschein, Thomas Mesnard, Zhouhan Lin
TL;DR
The paper addresses credit assignment in deep networks from a biological perspective, arguing that spike-timing-dependent plasticity (STDP) can implement gradient-like updates without backpropagation. It develops a variational EM framework with learned approximate inference, interpreted through the lens of denoising auto-encoders, and introduces target propagation as a biologically plausible gradient estimator. By structuring deep generative models with layer-wise updates and exploring both joint and latent-variable denoisers, the approach yields improved sampling and learning dynamics, validated on MNIST. The work lays groundwork for biologically plausible learning in supervised, unsupervised, and reinforcement settings, and outlines key open problems for neural implementation.
Abstract
Neuroscientists have long criticised deep learning algorithms as incompatible with current knowledge of neurobiology. We explore more biologically plausible versions of deep representation learning, focusing here mostly on unsupervised learning but developing a learning mechanism that could account for supervised, unsupervised and reinforcement learning. The starting point is that the basic learning rule believed to govern synaptic weight updates, Spike-Timing-Dependent Plasticity (STDP), arises out of a simple update rule that makes sense from a machine learning point of view. This rule can be interpreted as gradient descent on some objective function, so long as the neuronal dynamics push firing rates towards better values of that objective (be it supervised, unsupervised, or reward-driven). The second main idea is that this corresponds to a form of the variational EM algorithm, i.e., with approximate rather than exact posteriors, implemented by neural dynamics. Another contribution of this paper is that the gradients required for updating the hidden states in the above variational interpretation can be estimated using an approximation that only requires propagating activations forward and backward, with pairs of layers learning to form a denoising auto-encoder. Finally, we extend the theory about the probabilistic interpretation of auto-encoders to justify improved sampling schemes based on the generative interpretation of denoising auto-encoders, and we validate all these ideas on generative learning tasks.
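The claim that an STDP-like rule behaves like a gradient step whenever the neural dynamics improve the objective can be illustrated with a small numerical sketch. The Python code below assumes a rate-based form of the rule, in which the weight change is proportional to presynaptic activity times the temporal change in postsynaptic activity. That form, the linear neuron, the quadratic objective, and all names (`stdp_like_update`, `objective`, `eps`) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def stdp_like_update(pre, post_before, post_after, lr=1.0):
    # Assumed rate-based rule: dW[i, j] = lr * pre[i] * (change in post[j]).
    return lr * np.outer(pre, post_after - post_before)

def objective(W, pre, target):
    # Hypothetical objective J (to be maximised) for a linear neuron v = W.T @ pre.
    v = W.T @ pre
    return -0.5 * np.sum((target - v) ** 2)

rng = np.random.default_rng(0)
pre = rng.random(4)              # presynaptic firing rates
W = rng.normal(size=(4, 3))      # synaptic weights (pre x post)
target = rng.normal(size=3)      # value of v that maximises J

v = W.T @ pre                    # postsynaptic activity before the dynamics act
dJ_dv = target - v               # gradient of J with respect to v
eps = 0.01
v_nudged = v + eps * dJ_dv       # dynamics push v towards better values of J

dW = stdp_like_update(pre, v, v_nudged)
grad_W = np.outer(pre, dJ_dv)    # exact gradient of J with respect to W

print(np.allclose(dW, eps * grad_W))
print(objective(W + dW, pre, target) > objective(W, pre, target))
```

In this toy setting the update computed from local pre- and postsynaptic quantities coincides with a scaled gradient of the objective with respect to the weights, so applying it improves the objective. The equivalence relies on the assumed linear neuron and on the postsynaptic change pointing along the gradient; it is not a general proof.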
