Towards the Training of Deeper Predictive Coding Neural Networks

Chang Qi; Matteo Forasassi; Thomas Lukasiewicz; Tommaso Salvatori

TL;DR

This paper tackles the poor scalability of predictive coding (PC) networks to deep architectures. It diagnoses energy propagation as the root cause of degraded learning and introduces precision-based mechanisms to balance layer-wise errors. The authors propose time-dependent precision schedules (notably Spiking Precision), a forward-update weight rule, residual-connection buffering with auxiliary neurons, and BatchNorm Freezing to stabilize iterative inference. Together, these algorithmic and architectural changes enable PC and Incremental PC to achieve performance on par with backpropagation on deep networks such as VGG models with up to 15 layers and ResNet18 on Tiny ImageNet, which points to the potential of energy-efficient, local learning in deep models. The work shows that carefully modulating energy flow and update dynamics can close the gap between PC methods and standard backpropagation in real-world deep learning tasks.

Abstract

Predictive coding networks are neural models that perform inference through an iterative energy minimization process, whose operations are local in space and time. While effective in shallow architectures, they suffer significant performance degradation beyond five to seven layers. In this work, we show that this degradation is caused by exponentially imbalanced errors between layers during weight updates, and by predictions from the previous layers not being effective in guiding updates in deeper layers. Furthermore, when training models with skip connections, the energy propagated by the residuals reaches higher layers faster than that propagated by the main pathway, affecting test accuracy. We address the first issue by introducing a novel precision-weighted optimization of latent variables that balances error distributions during the relaxation phase, the second issue by proposing a novel weight update mechanism that reduces error accumulation in deeper layers, and the third one by using auxiliary neurons that slow down the propagation of the energy in the residual connections. Empirically, our methods achieve performance comparable to backpropagation on deep models such as ResNets, opening new possibilities for predictive coding in complex tasks.
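To make the relaxation phase described in the abstract concrete, the following is a minimal sketch of precision-weighted inference in a small predictive coding network. It is not the authors' implementation: the tanh nonlinearity, the layer sizes, the step size, the fixed per-layer precision values, and the names `layer_errors`, `energy`, and `relax` are all assumptions chosen for illustration, and the paper's time-dependent schedules (such as Spiking Precision) are not reproduced here.

```python
import numpy as np

def layer_errors(xs, Ws):
    """Prediction errors e_l = x_{l+1} - W_l tanh(x_l) for every layer l."""
    return [xs[l + 1] - Ws[l] @ np.tanh(xs[l]) for l in range(len(Ws))]

def energy(xs, Ws, precisions):
    """Precision-weighted energy: sum over layers of 0.5 * pi_l * ||e_l||^2."""
    errs = layer_errors(xs, Ws)
    return sum(0.5 * p * float(e @ e) for p, e in zip(precisions, errs))

def relax(xs, Ws, precisions, steps=50, lr=0.02):
    """Gradient descent on the hidden states only.

    The input xs[0] and the target xs[-1] stay clamped. Each update uses
    only the errors of the two layers adjacent to the state being updated.
    """
    xs = [x.copy() for x in xs]
    for _ in range(steps):
        errs = layer_errors(xs, Ws)
        for l in range(1, len(xs) - 1):
            # Error of the prediction arriving at layer l ...
            bottom_up = precisions[l - 1] * errs[l - 1]
            # ... minus the error layer l causes in the layer above it.
            top_down = precisions[l] * (1 - np.tanh(xs[l]) ** 2) * (Ws[l].T @ errs[l])
            xs[l] = xs[l] - lr * (bottom_up - top_down)
    return xs

# Hypothetical toy setup: three weight matrices, a one-hot target.
rng = np.random.default_rng(0)
dims = [8, 8, 8, 4]
Ws = [rng.normal(0.0, 1.0 / np.sqrt(dims[i]), (dims[i + 1], dims[i]))
      for i in range(len(dims) - 1)]

# Initialise the states with a forward pass, then clamp the output to the target.
xs = [rng.normal(size=dims[0])]
for W in Ws:
    xs.append(W @ np.tanh(xs[-1]))
xs[-1] = np.eye(dims[-1])[0]

precisions = [1.0, 0.5, 0.25]  # arbitrary illustrative per-layer weights
e_before = energy(xs, Ws, precisions)
relaxed = relax(xs, Ws, precisions)
e_after = energy(relaxed, Ws, precisions)
```

After the forward-pass initialisation, all of the energy sits in the output layer; relaxation redistributes it across the hidden layers, and the per-layer precisions control how strongly each layer's error pulls on its neighbours. In this sketch the precisions are constants, whereas the paper varies them over the course of inference.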
