Scaling Equilibrium Propagation to Deeper Neural Network Architectures
Sankar Vinayak Elayedam, Gopalakrishnan Srinivasan
TL;DR
This work tackles the depth and accuracy gap of equilibrium propagation (EP) by introducing Hopfield-Resnet, a residual Hopfield network with a bounded activation that enables training deeper architectures via EP. It combines residual connections and a clipped ReLU activation, $\mathrm{ReLU}_\alpha$ (notably ReLU6), within a convolutional Hopfield framework, preserving the EP energy function while scaling to more than 12 layers. With centered equilibrium propagation (CEP) to reduce gradient bias, Hopfield-Resnet13 achieves $93.92\%$ accuracy on CIFAR-10, closely rivaling Resnet13 trained with backpropagation (BP); experiments span the CIFAR-10, CIFAR-100, and Fashion-MNIST datasets. The results suggest EP can approach BP performance on deeper networks and highlight the need for specialized hardware and algorithmic optimizations to realize practical on-device learning advantages of EP-enabled architectures.
Abstract
Equilibrium propagation has been proposed as a biologically plausible alternative to the backpropagation algorithm. The local nature of gradient computations, combined with the use of convergent RNNs to reach equilibrium states, makes this approach well-suited for implementation on neuromorphic hardware. However, previous studies on equilibrium propagation have been restricted to networks containing only dense layers or relatively small architectures with a few convolutional layers followed by a final dense layer. These networks have a significant gap in accuracy compared to similarly sized feedforward networks trained with backpropagation. In this work, we introduce the Hopfield-Resnet architecture, which incorporates residual (or skip) connections in Hopfield networks with clipped $\mathrm{ReLU}$ as the activation function. The proposed architectural enhancements enable the training of networks with nearly twice the number of layers reported in prior works. For example, Hopfield-Resnet13 achieves 93.92\% accuracy on CIFAR-10, which is $\approx$3.5\% higher than the previous best result and comparable to that of Resnet13 trained using backpropagation.
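For reference, the clipped ReLU named in the abstract can be sketched in a few lines of Python. This is only an illustration of the standard definition $\min(\max(0, x), \alpha)$; the function name `relu_alpha` and the default `alpha=6.0` (which corresponds to ReLU6) are assumptions made here, not the authors' code.

```python
def relu_alpha(x: float, alpha: float = 6.0) -> float:
    """Clipped ReLU: zero for negative inputs, identity on [0, alpha],
    and saturated at alpha above that. alpha=6.0 gives ReLU6.

    Hypothetical helper for illustration only.
    """
    return min(max(0.0, x), alpha)


# Outputs stay inside [0, alpha] regardless of the input magnitude.
samples = [-2.0, 0.5, 3.0, 10.0]
clipped = [relu_alpha(s) for s in samples]
```

Because the output is confined to $[0, \alpha]$, neuron states passed through this activation remain bounded, which is the property the abstract refers to as a clipped activation.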
