Accelerated Predictive Coding Networks via Direct Kolen-Pollack Feedback Alignment
Davide Casnici, Martin Lefebvre, Justin Dauwels, Charlotte Frenkel
TL;DR
This work addresses the biological-plausibility and hardware-efficiency concerns of backpropagation (BP) by extending predictive coding (PC) into Direct Kolen-Pollack Predictive Coding (DKP-PC). By introducing learnable direct feedback connections from the output layer to all hidden layers, DKP-PC removes both the error-propagation delay and the exponential feedback decay that limit PC, reducing the error-propagation time complexity from $O(L)$ to $O(1)$, where $L$ is the network depth, while preserving update locality. DKP-PC combines direct feedback alignment with Kolen-Pollack (KP)-inspired learning, enabling a single inference step to achieve training performance that matches or exceeds standard PC and rivals BP across a range of networks and datasets, including VGG-like CNNs on Tiny ImageNet. The approach also demonstrates substantial gains in training speed and energy efficiency, highlighting its potential for neuromorphic hardware and on-chip learning, with room for further optimization through custom kernels and sparse or quantized feedback weights.
Abstract
Predictive coding (PC) is a biologically inspired algorithm for training neural networks that relies only on local updates, allowing parallel learning across layers. However, practical implementations face two key limitations: error signals must still propagate from the output to early layers through multiple inference-phase steps, and feedback decays exponentially during this process, leading to vanishing updates in early layers. We propose direct Kolen-Pollack predictive coding (DKP-PC), which simultaneously addresses both feedback delay and exponential decay, yielding a more efficient and scalable variant of PC while preserving update locality. Leveraging direct feedback alignment and direct Kolen-Pollack algorithms, DKP-PC introduces learnable feedback connections from the output layer to all hidden layers, establishing a direct pathway for error transmission. This yields an algorithm that reduces the theoretical error propagation time complexity from O(L), with L being the network depth, to O(1), removing depth-dependent delay in error signals. Moreover, empirical results demonstrate that DKP-PC achieves performance at least comparable to, and often exceeding, that of standard PC, while offering improved latency and computational performance, supporting its potential for custom hardware-efficient implementations.
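The direct feedback pathway described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' DKP-PC algorithm: the predictive coding inference phase is omitted entirely, and the function names (`init`, `forward`, `direct_feedback_signals`, `step`), the tanh nonlinearity, the squared-error loss, and the exact form of the feedback-weight update are assumptions made for illustration. The sketch only shows the two ingredients named above: each hidden layer receives the output error through its own learnable matrix `B[l]`, so no layer waits on another, and the feedback weights are trained with a Kolen-Pollack-style rule (transposed outer product plus the same weight decay as the forward weights).

```python
import numpy as np

rng = np.random.default_rng(0)

def tanh_prime(a):
    return 1.0 - np.tanh(a) ** 2

def init(sizes):
    # Forward weights W[l] map layer l to layer l+1.
    W = [rng.normal(0.0, 0.1, (sizes[i + 1], sizes[i])) for i in range(len(sizes) - 1)]
    # One direct feedback matrix per hidden layer: output error -> that hidden layer.
    B = [rng.normal(0.0, 0.1, (sizes[i + 1], sizes[-1])) for i in range(len(sizes) - 2)]
    return W, B

def forward(W, x):
    pre, act = [], [x]
    for i, w in enumerate(W):
        a = w @ act[-1]
        pre.append(a)
        # tanh on hidden layers, linear output layer (an assumption of this sketch).
        act.append(a if i == len(W) - 1 else np.tanh(a))
    return pre, act

def direct_feedback_signals(B, pre, e_out):
    # Each hidden-layer signal uses only the output error and local pre-activations,
    # so all layers can be computed in parallel, independent of depth.
    return [(b @ e_out) * tanh_prime(a) for b, a in zip(B, pre[:-1])]

def step(W, B, x, y, lr=0.05, decay=1e-3):
    pre, act = forward(W, x)
    e_out = act[-1] - y  # gradient of 0.5 * ||y_hat - y||^2 w.r.t. the output
    deltas = direct_feedback_signals(B, pre, e_out) + [e_out]
    for l, d in enumerate(deltas):
        # Local forward update: layer signal times presynaptic activity, plus decay.
        W[l] -= lr * (np.outer(d, act[l]) + decay * W[l])
    for l in range(len(B)):
        # Assumed Kolen-Pollack-style feedback update: hidden activity times
        # output error, with the same decay as the forward weights.
        B[l] -= lr * (np.outer(act[l + 1], e_out) + decay * B[l])
    return 0.5 * float(e_out @ e_out)

# Hypothetical usage: a 4-8-8-3 network and one training step on a single sample.
W, B = init([4, 8, 8, 3])
loss = step(W, B, x=np.ones(4), y=np.zeros(3))
```

With this choice of updates, the feedback matrix of the last hidden layer and the transpose of the output weights receive identical outer-product terms, so their difference shrinks by a factor of `1 - lr * decay` per step. This is the alignment mechanism of the original Kolen-Pollack rule; for earlier hidden layers the sketch makes no such guarantee.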
