The Forward-Forward Algorithm: Some Preliminary Investigations

Geoffrey Hinton

TL;DR

The paper introduces the Forward-Forward algorithm (FF), a learning procedure that replaces the forward and backward passes of backpropagation with two forward passes, one on positive (real) data and one on negative data, and trains each layer with its own local goodness objective. FF is motivated by concerns about the biological plausibility of backpropagation and by online learning: because no derivatives are propagated backward and no activities need to be stored, streaming data such as video can be pipelined through the network. The paper reports results on MNIST and CIFAR-10 with relatively small networks and argues that they are good enough to justify further investigation. It situates FF among contrastive learning methods, relating it to Boltzmann Machines, GANs, and self-supervised methods that compare representations of two different image crops, and it discusses hardware implications, including analog implementations and the concept of mortal computation. It also covers architectural details (e.g., layer normalization to prevent shortcut solutions) and lists open questions about scaling, activation functions, and generative capabilities.

Abstract

The aim of this paper is to introduce a new learning procedure for neural networks and to demonstrate that it works well enough on a few small problems to be worth further investigation. The Forward-Forward algorithm replaces the forward and backward passes of backpropagation by two forward passes, one with positive (i.e. real) data and the other with negative data which could be generated by the network itself. Each layer has its own objective function which is simply to have high goodness for positive data and low goodness for negative data. The sum of the squared activities in a layer can be used as the goodness but there are many other possibilities, including minus the sum of the squared activities. If the positive and negative passes could be separated in time, the negative passes could be done offline, which would make the learning much simpler in the positive pass and allow video to be pipelined through the network without ever storing activities or stopping to propagate derivatives.
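
The layer-wise objective described in the abstract can be illustrated with a small sketch. It is not the paper's code: it assumes a single ReLU layer, goodness defined as the sum of squared activities, and a logistic function of goodness minus a threshold as the probability that an input is positive. The threshold `THETA`, the learning rate `LR`, the layer sizes, the weight initialisation, and the two toy input vectors are all illustrative assumptions.

```python
import math
import random

THETA = 2.0  # goodness threshold (assumed value for this toy example)
LR = 0.05    # learning rate (assumed value for this toy example)


def forward(weights, x):
    """One fully connected ReLU layer, without biases for brevity."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]


def goodness(activities):
    """Goodness as the sum of squared activities in the layer."""
    return sum(a * a for a in activities)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def ff_update(weights, x, is_positive):
    """One local update: raise goodness for positive data, lower it for negative.

    Only this layer's own activities are used; nothing is propagated backward.
    """
    y = forward(weights, x)
    p_positive = sigmoid(goodness(y) - THETA)
    # Gradient of the log-likelihood with respect to goodness is (target - p).
    error = (1.0 if is_positive else 0.0) - p_positive
    for j, row in enumerate(weights):
        for i, xi in enumerate(x):
            # d(goodness)/d(w_ji) = 2 * y_j * x_i (zero when the ReLU is off).
            row[i] += LR * error * 2.0 * y[j] * xi


rng = random.Random(0)
# Small non-negative weights so every unit starts active (toy assumption).
weights = [[rng.uniform(0.0, 0.2) for _ in range(4)] for _ in range(8)]

positive = [1.0, 1.0, 0.0, 0.0]  # stands in for real data
negative = [0.0, 0.0, 1.0, 1.0]  # stands in for negative data

g_pos_before = goodness(forward(weights, positive))
g_neg_before = goodness(forward(weights, negative))

for _ in range(300):
    ff_update(weights, positive, is_positive=True)   # positive pass
    ff_update(weights, negative, is_positive=False)  # negative pass

g_pos = goodness(forward(weights, positive))
g_neg = goodness(forward(weights, negative))
print("positive goodness:", g_pos_before, "->", g_pos)
print("negative goodness:", g_neg_before, "->", g_neg)
```

In this sketch the positive and negative passes are interleaved, but each call to `ff_update` depends only on its own input, so the negative passes could in principle be run at a different time, which is the separation the abstract speculates about.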

Paper Structure

This paper contains 21 sections, 4 equations, 3 figures, and 1 table.

What is wrong with backpropagation
The Forward-Forward Algorithm
Learning multiple layers of representation with a simple layer-wise goodness function
Some experiments with FF
The backpropagation baseline
A simple unsupervised example of FF
A simple supervised example of FF
Using FF to model top-down effects in perception
Using predictions from the spatial context as a teacher
Experiments with CIFAR-10
Sleep
How FF relates to other contrastive learning techniques
Relationship to Boltzmann Machines
Relationship to Generative Adversarial Networks
Relationship to contrastive methods that compare representations of two different image crops
...and 6 more sections

Figures (3)

Figure 1: A hybrid image used as negative data
Figure 2: The receptive fields of 100 neurons in the first hidden layer of the network trained on jittered MNIST. The class labels are represented in the first 10 pixels of each image.
Figure 3: The recurrent network used to process video.
