Training Large Neural Networks With Low-Dimensional Error Feedback
Maher Hanut, Jonathan Kadmon
TL;DR
This work shows that training deep networks need not transport full gradient information; a learned, low-dimensional teaching signal can suffice for effective credit assignment when projected into the task-relevant subspace. The authors develop low-dimensional feedback alignment (LDFA), combining a rank-$r$ backward pathway $B=QP$ with either normative or local subspace learning rules, and demonstrate near-backpropagation performance across linear models, CNNs, and vision transformers on CIFAR-10/100. They show that error dimensionality primarily tracks task dimensionality $d$, enabling substantial backward-pass compute savings while preserving accuracy, and that the dimensionality of the error channel shapes early representations, offering biological plausibility and new inductive biases for learning systems. The results imply a principled rethinking of gradient-based learning in high-dimensional systems and point toward practical, brain-inspired approaches for efficient training and representation learning.
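To make the claimed backward-pass saving concrete, the following is a minimal sketch of a rank-$r$ feedback pathway $B=QP$, not the authors' implementation. It assumes the error vector has dimension `n_out`, the receiving layer has width `n_hidden`, and that `P` has shape `r x n_out` and `Q` has shape `n_hidden x r`. All sizes, names, and the random initialization are hypothetical, and the rules for learning `Q` and `P` are omitted.

```python
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def random_matrix(rows, cols, rng):
    """Gaussian random matrix, a stand-in for feedback weights."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)

# Hypothetical sizes: error dimension, layer width, feedback rank.
n_out, n_hidden, r = 32, 64, 4

P = random_matrix(r, n_out, rng)      # compresses the error to r dimensions
Q = random_matrix(n_hidden, r, rng)   # expands it to the layer's width
e = [rng.gauss(0.0, 1.0) for _ in range(n_out)]  # error signal

# Dense route: form B = QP explicitly, then apply it to the error.
B = matmul(Q, P)
teaching_dense = matvec(B, e)

# Factored route: never form B; pass the error through the r-dim bottleneck.
teaching_factored = matvec(Q, matvec(P, e))

# Both routes give the same teaching signal up to floating-point error.
max_diff = max(abs(a - b) for a, b in zip(teaching_dense, teaching_factored))

# Multiply-accumulate counts for one feedback application.
dense_macs = n_hidden * n_out            # applying a dense n_hidden x n_out matrix
low_rank_macs = r * (n_hidden + n_out)   # applying P, then Q

print(max_diff, dense_macs, low_rank_macs)
```

With these sizes the factored route needs `r * (n_hidden + n_out)` multiply-accumulates instead of `n_hidden * n_out`, so the saving grows as the rank shrinks relative to the layer widths.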
Abstract
Training deep neural networks typically relies on backpropagating high-dimensional error signals, a computationally intensive process with little evidence supporting its implementation in the brain. However, since most tasks involve low-dimensional outputs, we propose that low-dimensional error signals may suffice for effective learning. To test this hypothesis, we introduce a novel local learning rule based on Feedback Alignment that leverages indirect, low-dimensional error feedback to train large networks. Our method decouples the backward pass from the forward pass, enabling precise control over error signal dimensionality while maintaining high-dimensional representations. We begin with a detailed theoretical derivation for linear networks, which forms the foundation of our learning framework, and extend our approach to nonlinear, convolutional, and transformer architectures. Remarkably, we demonstrate that even minimal error dimensionality, on the order of the task dimensionality, can achieve performance matching that of traditional backpropagation. Furthermore, our rule enables efficient training of convolutional networks, which have previously been resistant to Feedback Alignment methods, with minimal error. This breakthrough not only paves the way toward more biologically accurate models of learning but also challenges the conventional reliance on high-dimensional gradient signals in neural network training. Our findings suggest that low-dimensional error signals can be as effective as high-dimensional ones, prompting a reevaluation of gradient-based learning in high-dimensional systems. Ultimately, our work offers a fresh perspective on neural network optimization and contributes to understanding learning mechanisms in both artificial and biological systems.
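As a rough illustration of what a Feedback Alignment style update with low-dimensional feedback can look like, consider a single hidden layer with input $x$, pre-activation $h = Wx$, nonlinearity $f$, and output error $e \in \mathbb{R}^{d}$. The form below is the standard Feedback Alignment update with the feedback matrix written as a rank-$r$ product; the symbols $\delta$, $\eta$, and the shapes of $Q$ and $P$ are assumptions made for this sketch, and the rules by which $Q$ and $P$ are themselves learned are not stated in this section.

$$
\delta = (Q P e) \odot f'(h), \qquad \Delta W = -\eta\, \delta\, x^{\top}, \qquad P \in \mathbb{R}^{r \times d},\; Q \in \mathbb{R}^{n \times r}
$$

Here $n$ is the width of the hidden layer and $\eta$ is a learning rate. The error reaches the layer only through the $r$-dimensional signal $Pe$, so the backward pathway is decoupled from the forward weights and its dimensionality is set by $r$ alone.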
