Employing Layerwised Unsupervised Learning to Lessen Data and Loss Requirements in Forward-Forward Algorithms
Taewook Hwang, Hyein Seo, Sangkeun Jung
TL;DR
The paper addresses the gap between biologically plausible learning and backpropagation (BP) by extending the Forward-Forward (FF) algorithm into Unsupervised Forward-Forward (UFF). UFF replaces each single layer with an autonomous unsupervised model, called a cell (AE, DAE, CAE, or GAN), so that training needs only forward passes with standard inputs and losses. Each cell outputs a latent vector that feeds the next cell, and a final classifier combines all latents for prediction. Cells are trained locally with reconstruction or adversarial losses, optimized with AdamW and stabilized by layer normalization. On MNIST and CIFAR-10, BP-trained CNNs still lead in accuracy, but UFF variants, especially CAEFF, reach competitive performance and are more stable than FF under separate training, at the cost of longer training times. The findings suggest UFF as a practical alternative where BP is difficult to implement, such as federated learning, and point to further hyperparameter tuning and model refinement as the way to close the remaining gap with BP.
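The cell-chaining idea in the summary above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses tiny linear autoencoders in pure Python, plain SGD rather than AdamW, no layer normalization, and greedy cell-by-cell training. The names `LinearAECell`, `train_stack`, and `all_latents` are hypothetical.

```python
import random


class LinearAECell:
    """Hypothetical UFF-style cell: a linear autoencoder trained on its own loss only."""

    def __init__(self, d_in, d_latent, rng):
        self.enc = [[rng.uniform(-0.5, 0.5) for _ in range(d_in)] for _ in range(d_latent)]
        self.dec = [[rng.uniform(-0.5, 0.5) for _ in range(d_latent)] for _ in range(d_in)]

    def encode(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.enc]

    def decode(self, z):
        return [sum(w * zi for w, zi in zip(row, z)) for row in self.dec]

    def loss(self, x):
        # Mean squared reconstruction error for one sample.
        r = self.decode(self.encode(x))
        return sum((ri - xi) ** 2 for ri, xi in zip(r, x)) / len(x)

    def train_step(self, x, lr):
        # One SGD step on the local reconstruction loss; no gradient leaves the cell.
        z = self.encode(x)
        r = self.decode(z)
        n = len(x)
        err = [2.0 * (ri - xi) / n for ri, xi in zip(r, x)]  # dL/dr
        # dL/dz, computed before the decoder weights are updated.
        dz = [sum(self.dec[i][j] * err[i] for i in range(n)) for j in range(len(z))]
        for i in range(n):
            for j in range(len(z)):
                self.dec[i][j] -= lr * err[i] * z[j]
        for j in range(len(z)):
            for i in range(n):
                self.enc[j][i] -= lr * dz[j] * x[i]


def mean_loss(cell, inputs):
    return sum(cell.loss(x) for x in inputs) / len(inputs)


def train_stack(cells, data, epochs=200, lr=0.05):
    """Train each cell locally, then pass its latents on as the next cell's input."""
    history = []
    inputs = data
    for cell in cells:
        before = mean_loss(cell, inputs)
        for _ in range(epochs):
            for x in inputs:
                cell.train_step(x, lr)
        history.append((before, mean_loss(cell, inputs)))
        inputs = [cell.encode(x) for x in inputs]
    return history


def all_latents(cells, x):
    """Concatenate every cell's latent vector, as input for a final classifier."""
    feats = []
    for cell in cells:
        x = cell.encode(x)
        feats.extend(x)
    return feats


rng = random.Random(0)
data = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(8)]  # toy inputs
cells = [LinearAECell(4, 3, rng), LinearAECell(3, 2, rng)]
history = train_stack(cells, data)
features = all_latents(cells, data[0])
```

Each entry of `history` holds the reconstruction loss of one cell before and after its local training, and `features` concatenates the latents of both cells (3 + 2 values), which is the vector a final classifier would consume. The classifier itself is omitted here.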
Abstract
Recent deep learning models such as ChatGPT, which are trained with the back-propagation algorithm, have exhibited remarkable performance. However, the disparity between biological brain processes and the back-propagation algorithm has been noted. The Forward-Forward algorithm, which trains deep learning models solely through the forward pass, has emerged to address this. Although the Forward-Forward algorithm cannot replace back-propagation because of limitations such as its need for special inputs and loss functions, it has the potential to be useful in special situations where back-propagation is difficult to use. To work around this limitation and verify usability, we propose an Unsupervised Forward-Forward algorithm. Using an unsupervised learning model enables training with ordinary loss functions and inputs without restriction. With this approach, we achieve stable learning and enable versatile use across various datasets and tasks. From a usability perspective, given the characteristics of the Forward-Forward algorithm and the advantages of the proposed method, we anticipate its practical application in scenarios such as federated learning, where deep learning layers need to be trained separately in physically distributed environments.
