Scaling Equilibrium Propagation to Deeper Neural Network Architectures

Sankar Vinayak Elayedam, Gopalakrishnan Srinivasan

TL;DR

This work tackles the depth and accuracy gap of equilibrium propagation (EP) by introducing Hopfield-Resnet, a residual Hopfield network with a bounded activation that can be trained at greater depth with EP. It combines residual connections and a ReLU$\alpha$ activation (notably ReLU6) within a convolutional Hopfield framework, preserving the EP energy function while scaling to more than 12 layers. Using centered equilibrium propagation (CEP) to reduce gradient bias, Hopfield-Resnet13 achieves $93.92\%$ accuracy on CIFAR-10, close to that of Resnet13 trained with backpropagation (BP); experiments span the CIFAR-10, CIFAR-100, and Fashion-MNIST datasets. The results suggest that EP can approach BP performance on deeper networks, and they highlight the need for specialized hardware and algorithmic optimizations before EP-enabled architectures can deliver practical on-device learning advantages.
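To make the "reduce gradient bias" remark concrete, the following is a minimal sketch on a toy scalar model, not the paper's network or code. It assumes a hypothetical energy $E(s) = \tfrac{1}{2}s^2 - \theta x s$ with cost $C(s) = \tfrac{1}{2}(s - y)^2$, so the equilibrium of $F = E + \beta C$ has a closed form. All function names are illustrative. The sketch compares a one-sided EP estimate (free phase versus a $+\beta$ nudge) with a centered estimate (nudges at $+\beta$ and $-\beta$).

```python
# Toy illustration (assumed model, not taken from the paper).
# Total energy: F(s) = 0.5*s**2 - theta*x*s + beta * 0.5*(s - y)**2

def equilibrium_state(theta, x, y, beta):
    """Closed-form minimiser of F with respect to the state s."""
    return (theta * x + beta * y) / (1.0 + beta)

def dF_dtheta(s, x):
    """Partial derivative of F with respect to theta at a fixed state s."""
    return -x * s

def one_sided_estimate(theta, x, y, beta):
    """Gradient estimate from the free phase (beta = 0) and one nudged phase."""
    s_free = equilibrium_state(theta, x, y, 0.0)
    s_nudged = equilibrium_state(theta, x, y, beta)
    return (dF_dtheta(s_nudged, x) - dF_dtheta(s_free, x)) / beta

def centered_estimate(theta, x, y, beta):
    """Gradient estimate from two symmetric nudged phases (+beta and -beta)."""
    s_plus = equilibrium_state(theta, x, y, beta)
    s_minus = equilibrium_state(theta, x, y, -beta)
    return (dF_dtheta(s_plus, x) - dF_dtheta(s_minus, x)) / (2.0 * beta)

def true_gradient(theta, x, y):
    """Exact gradient of the loss 0.5*(theta*x - y)**2 at the free equilibrium."""
    return (theta * x - y) * x

if __name__ == "__main__":
    theta, x, y, beta = 0.5, 2.0, 3.0, 0.1
    print("true     :", true_gradient(theta, x, y))
    print("one-sided:", one_sided_estimate(theta, x, y, beta))
    print("centered :", centered_estimate(theta, x, y, beta))
```

In this toy model the one-sided estimate equals the true gradient divided by $(1 + \beta)$, an error of order $\beta$, while the centered estimate equals it divided by $(1 - \beta^2)$, an error of order $\beta^2$.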

Abstract

Equilibrium propagation has been proposed as a biologically plausible alternative to the backpropagation algorithm. The local nature of gradient computations, combined with the use of convergent RNNs to reach equilibrium states, makes this approach well-suited for implementation on neuromorphic hardware. However, previous studies on equilibrium propagation have been restricted to networks containing only dense layers or to relatively small architectures with a few convolutional layers followed by a final dense layer. These networks show a significant accuracy gap compared to similarly sized feedforward networks trained with backpropagation. In this work, we introduce the Hopfield-Resnet architecture, which incorporates residual (or skip) connections in Hopfield networks with clipped $\mathrm{ReLU}$ as the activation function. The proposed architectural enhancements enable the training of networks with nearly twice the number of layers reported in prior works. For example, Hopfield-Resnet13 achieves 93.92\% accuracy on CIFAR-10, which is $\approx$3.5\% higher than the previous best result and comparable to that of Resnet13 trained using backpropagation.
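As a small illustration of the clipped activation mentioned above, here is a sketch of a clipped ReLU in plain Python. The function name `clipped_relu` and the parameter `alpha` are illustrative assumptions. The default upper bound of 6 corresponds to the common ReLU6 definition, $\min(\max(x, 0), 6)$.

```python
def clipped_relu(x, alpha=6.0):
    """Clipped ReLU: zero below 0, identity on [0, alpha], constant alpha above.

    alpha=6.0 gives the usual ReLU6; the name and signature are illustrative.
    """
    return min(max(x, 0.0), alpha)

if __name__ == "__main__":
    for value in (-2.0, 0.5, 3.0, 6.0, 10.0):
        print(value, "->", clipped_relu(value))
```

Because the output always lies in the interval $[0, \alpha]$, neuron states passed through this function stay bounded, which is the property the "bounded activation" wording refers to.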

Paper Structure

This paper contains 16 sections, 9 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Architecture of Hopfield-Resnet13, consisting of four Hopfield-Resnet blocks, each containing three convolutional layers (two of them in the main pathway and one skip connection), followed by a dense output layer.
  • Figure 2: Test loss for the Hopfield-Resnet13 architecture trained using centered equilibrium propagation (CEP), with and without the skip connections, on the CIFAR-10 test set.
  • Figure 3: Test error over epochs for VGG5 and Hopfield-Resnet architectures, shown for different combinations of activation function and data augmentation on the CIFAR-10 test set. Note that the $y$-axis is displayed on a log scale.
  • Figure 4: Layer-wise distribution of weight values in Resnet13 trained with backpropagation and Hopfield-Resnet13 trained with centered equilibrium propagation on the CIFAR-10 dataset.