A Taxonomy of Recurrent Learning Rules

Guillermo Martín-Sánchez; Sander Bohté; Sebastian Otte

TL;DR

This work analyzes gradient computation in recurrent networks by unifying Backpropagation Through Time (BPTT), Real-Time Recurrent Learning (RTRL), and the online, local learning rule e-prop. It formalizes a common computational-graph framework, derives RTRL from BPTT by re-expressing the implicit and explicit recurrences, and then casts e-prop as a causal, local approximation that discards non-causal explicit interactions. A key contribution is a family of m-order e-prop algorithms, which progressively reintroduce higher-order (more distant) temporal and cross-neuron dependencies to improve gradient accuracy while preserving causality and locality to varying degrees. The framework clarifies the trade-offs between computational cost, memory, and gradient fidelity, and highlights practical pathways for online learning in recurrent architectures, including RSNNs and LSTMs.

Abstract

Backpropagation through time (BPTT) is the de facto standard for training recurrent neural networks (RNNs), but it is non-causal and non-local. Real-time recurrent learning (RTRL) is a causal alternative, but it is highly inefficient. Recently, e-prop was proposed as a causal, local, and efficient practical alternative to these algorithms, providing an approximation of the exact gradient by radically pruning the recurrent dependencies carried over time. Here, we derive RTRL from BPTT using a detailed notation that brings intuition and clarity to how the two are connected. Furthermore, we frame e-prop within this picture, formalising what it approximates. Finally, we derive a family of algorithms of which e-prop is a special case.
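The relationship between the three algorithms can be illustrated with a minimal NumPy sketch. It is not the paper's code: the leaky tanh network, the squared-error loss, and every function and variable name below are assumptions chosen for illustration. In this toy model the leak `alpha` plays the role of the implicit recurrence (state to state), and the path through the weights `W` plays the role of the explicit recurrence; the e-prop-like rule keeps only the former in its eligibility trace.

```python
import numpy as np

# Hypothetical toy model (an assumption, not taken from the paper):
#   c[t] = alpha * c[t-1] + W @ h[t-1] + U @ x[t],   h[t] = tanh(c[t])
#   E    = sum_t 0.5 * ||h[t] - y[t]||^2

def forward(W, U, xs, alpha):
    """Run the network and return all hidden outputs; hs[0] is the zero initial state."""
    n = W.shape[0]
    c, h = np.zeros(n), np.zeros(n)
    hs = [h]
    for x in xs:
        c = alpha * c + W @ h + U @ x
        h = np.tanh(c)
        hs.append(h)
    return np.array(hs)

def grad_bptt(W, U, xs, ys, alpha):
    """Exact dE/dW by a backward sweep over time (non-causal: needs the full sequence)."""
    hs = forward(W, U, xs, alpha)
    grad = np.zeros_like(W)
    dc_next = np.zeros(W.shape[0])
    for t in range(len(xs), 0, -1):
        dh = (hs[t] - ys[t - 1]) + W.T @ dc_next           # error now + future via W
        dc = dh * (1.0 - hs[t] ** 2) + alpha * dc_next     # future via the leak
        grad += np.outer(dc, hs[t - 1])
        dc_next = dc
    return grad

def grad_rtrl(W, U, xs, ys, alpha):
    """Exact dE/dW computed forward in time with the full sensitivity tensor P."""
    n = W.shape[0]
    idx = np.arange(n)
    c, h = np.zeros(n), np.zeros(n)
    P = np.zeros((n, n, n))                                # P[k, i, j] = d c_k / d W_ij
    grad = np.zeros_like(W)
    for x, y in zip(xs, ys):
        J = alpha * np.eye(n) + W * (1.0 - h ** 2)         # d c[t] / d c[t-1]
        P = np.einsum('kl,lij->kij', J, P)
        P[idx, idx, :] += h                                # direct effect of W_ij on c_i
        c = alpha * c + W @ h + U @ x
        h = np.tanh(c)
        signal = (h - y) * (1.0 - h ** 2)                  # instantaneous dE/dc[t]
        grad += np.einsum('k,kij->ij', signal, P)
    return grad

def grad_eprop_like(W, U, xs, ys, alpha):
    """Approximate dE/dW: the trace follows only the leak, ignoring paths through W."""
    n = W.shape[0]
    c, h = np.zeros(n), np.zeros(n)
    eps = np.zeros_like(W)                                 # eps[i, j]: local eligibility trace
    grad = np.zeros_like(W)
    for x, y in zip(xs, ys):
        eps = alpha * eps + h[None, :]
        c = alpha * c + W @ h + U @ x
        h = np.tanh(c)
        signal = (h - y) * (1.0 - h ** 2)
        grad += signal[:, None] * eps
    return grad

rng = np.random.default_rng(0)
n, m, T, alpha = 4, 3, 12, 0.9
W = 0.5 * rng.standard_normal((n, n))
U = rng.standard_normal((n, m))
xs = rng.standard_normal((T, m))
ys = rng.standard_normal((T, n))

g_bptt = grad_bptt(W, U, xs, ys, alpha)
g_rtrl = grad_rtrl(W, U, xs, ys, alpha)
g_eprop = grad_eprop_like(W, U, xs, ys, alpha)
print("BPTT and RTRL agree:", np.allclose(g_bptt, g_rtrl))
print("max |e-prop-like - exact|:", np.abs(g_eprop - g_bptt).max())
```

In this sketch the RTRL tensor `P` has one entry per (unit, weight) pair, which is where its cost comes from, while the e-prop-like trace `eps` has the same shape as `W`. When `W` is all zeros there are no paths through the weights to discard, so the approximation coincides with the exact gradient in this toy setting.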

Paper Structure

This paper contains 52 sections, 54 equations, 9 figures, and 1 table.

Introduction
Background
Backpropagation Through Time
Explicit Recurrences:
Implicit Recurrences:
BPTT:
Non-locality:
Non-causality:
Real-Time Recurrent Learning
Re-expressing Implicit Recurrence
Unrolling the recursion:
Flip time indices:
Backwards interpretation:
Forwards interpretation:
Incremental computation:
...and 37 more sections

Figures (9)

Figure 1: Overview of all the algorithms and how they relate to each other.
Figure 2: Simple example of computational graph and distinction between total and partial derivative of $f$ with respect to $x$.
Figure 3: Computational graph for A) explicit recurrences gradients, B) implicit recurrence gradients and C) final computation of BPTT.
Figure 4: Computational graph for the implicit variable $\epsilon^t_{ij}$ with $t'= t-2$.
Figure 5: Computational graph for A) BPTT re-expressed with implicit eligibility trace (cf. Eq. \ref{eqg2}) and B) symmetric e-prop.
...and 4 more figures

Theorems & Definitions (8)

Definition : Implicit variable
Definition : Implicit eligibility trace
Definition : Explicit variable
Definition : Explicit eligibility trace
Definition : Recurrence variable
Definition : Recurrence eligibility trace
Definition : Read-out implicit variable
Definition : Read-out implicit eligibility trace
