The Forward-Forward Algorithm: Characterizing Training Behavior
Reece Adamson
TL;DR
This work analyzes the training dynamics of Forward-Forward networks, an alternative to backpropagation that uses two forward passes and layer-local goodness objectives. By varying network depth, width, and training epochs on MNIST, it demonstrates that deeper layers exhibit delayed accuracy gains while shallower layers' performance correlates more strongly with overall accuracy. The study introduces and tests hypotheses about depth-related delays and layer-to-global accuracy correlations, finding that layer depth conditions these relationships and that correlations weaken with depth or increased layer width. These insights contribute to a mechanistic understanding of Forward-Forward behavior and suggest practical directions for layer-wise network design and iterative construction, though broader validation across datasets is needed.
Abstract
The Forward-Forward algorithm is an alternative learning method that uses two forward passes rather than the forward and backward passes employed by backpropagation. Instead of a single global objective function, Forward-Forward networks use layer-local loss functions, each optimized on its layer's activations for each forward pass. This work explores how model and layer accuracy change in Forward-Forward networks as training progresses, in pursuit of a mechanistic understanding of their internal behavior. Treatments are applied to various system characteristics to investigate how layer and overall model accuracy change during training, how accuracy is affected by layer depth, and how strongly individual layer accuracy correlates with overall model accuracy. The empirical results suggest that layers deeper within Forward-Forward networks experience a delay in accuracy improvement relative to shallower layers and that shallower layer accuracy is strongly correlated with overall model accuracy.
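To make the idea of a layer-local objective concrete, the following is a minimal sketch of how one layer's loss might be computed from its own activations on the two forward passes. It rests on assumptions not stated in this section: goodness is taken to be the sum of squared activations, and the loss pushes goodness above a threshold on the positive pass and below it on the negative pass, as in Hinton's original Forward-Forward proposal. The names `goodness`, `layer_local_loss`, and the threshold value `theta=2.0` are illustrative only and may differ from the experimental setup used in this work.

```python
import math


def goodness(activations):
    """Sum of squared activations for one example (assumed goodness measure)."""
    return sum(a * a for a in activations)


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def layer_local_loss(pos_acts, neg_acts, theta=2.0):
    """Loss for a single layer, using only that layer's activations.

    pos_acts: activations from the forward pass on positive data.
    neg_acts: activations from the forward pass on negative data.
    theta:    hypothetical goodness threshold (an assumption of this sketch).
    """
    g_pos = goodness(pos_acts)
    g_neg = goodness(neg_acts)
    # Penalize low goodness on the positive pass and high goodness on the negative pass.
    return softplus(theta - g_pos) + softplus(g_neg - theta)


# A layer that already separates the two passes incurs a smaller loss
# than one that responds more strongly to negative data.
well_separated = layer_local_loss([1.5, 2.0, 0.5], [0.1, 0.2, 0.0])
poorly_separated = layer_local_loss([0.1, 0.2, 0.0], [1.5, 2.0, 0.5])
```

Because the loss depends only on one layer's activations, each layer can be updated without propagating gradients through the layers above it, which is what distinguishes this setup from a single global objective.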
