A Lyapunov theory demonstrating a fundamental limit on the speed of systems consolidation

Alireza Alemi, Emre R. F. Aksay, Mark S. Goldman

TL;DR

This work asks how neural systems can consolidate memories across brain regions without losing stability, and derives a Lyapunov-based stability framework for a two-stage consolidation model. The authors show that stability is guaranteed only if the late-stage learning rate does not exceed the early-stage learning rate, and that a slower late stage tolerates larger perturbations. A mapping to a damped driven oscillator provides the intuition: the ratio of early- to late-stage learning rates corresponds to the square of the oscillator's damping ratio. They formalize these insights with a Lyapunov function and prove global stability and asymptotic convergence under a bound α ≤ α_c, where α is the ratio of late- to early-stage learning rates, and they illustrate the approach on a cerebellum–neural integrator circuit, yielding testable predictions about cerebellar involvement during memory consolidation. The results place theoretical constraints on learning speed in both biological systems and adaptive engineering, suggesting design principles for stable, online memory storage and robust learning in AI systems.

Abstract

The nervous system reorganizes memories from an early site to a late site, a commonly observed feature of learning and memory systems known as systems consolidation. Previous work has suggested learning rules by which consolidation may occur. Here, we provide conditions under which such rules are guaranteed to lead to stable convergence of learning and consolidation. We use the theory of Lyapunov functions, which enforces stability by requiring learning rules to decrease an energy-like (Lyapunov) function. We present the theory in the context of a simple circuit architecture motivated by classic models of learning in systems consolidation mediated by the cerebellum. Stability is only guaranteed if the learning rate in the late stage is not faster than the learning rate in the early stage. Further, the slower the learning rate at the late stage, the larger the perturbation the system can tolerate with a guarantee of stability. We provide intuition for this result by mapping the consolidation model to a damped driven oscillator system, and showing that the ratio of early- to late-stage learning rates in the consolidation model can be directly identified with the (square of the) oscillator's damping ratio. This work suggests the power of the Lyapunov approach to provide constraints on nervous system function.
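
The oscillator mapping can be illustrated with a minimal sketch. Assume a linearized, unperturbed two-stage model with early weight $w_{1}$, late weight $w_{2}$, target $w^{*}$, and learning rates $\eta_{1}$, $\eta_{2}$, in which the early stage reduces the gain error and the late stage is driven by the early-stage weight. This specific form is an assumption made for illustration, not the paper's stated equations.

```latex
\begin{aligned}
\dot{w}_1 &= -\eta_1\,(w_1 + w_2 - w^{*}), \qquad \dot{w}_2 = \eta_2\, w_1 \\[4pt]
\ddot{w}_2 &= \eta_2\,\dot{w}_1 = -\eta_1\eta_2\,(w_1 + w_2 - w^{*})
            = -\eta_1\,\dot{w}_2 - \eta_1\eta_2\,(w_2 - w^{*}) \\[4pt]
&\Rightarrow\quad \ddot{w}_2 + \eta_1\,\dot{w}_2 + \eta_1\eta_2\,(w_2 - w^{*}) = 0 \\[4pt]
&\text{compare with}\quad \ddot{x} + 2\zeta\omega_n\,\dot{x} + \omega_n^{2}\,x = 0: \qquad
\omega_n^{2} = \eta_1\eta_2, \qquad
\zeta^{2} = \frac{\eta_1}{4\,\eta_2} = \frac{1}{4\alpha}
\end{aligned}
```

Under this assumed form, the squared damping ratio is proportional to the ratio of early- to late-stage learning rates, so a faster late stage (larger $\alpha=\eta_{2}/\eta_{1}$) gives a more weakly damped system. The factor of 1/4 is specific to this sketch.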

Paper Structure

This paper contains 15 sections, 2 theorems, 16 equations, 4 figures.

Key Result

Theorem 1

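Stability: If, in a ball $\mathbf{B}_{\mathbf{R_{0}}}$ around the equilibrium point $\mathbf{0}$, there exists a scalar function $L(\mathbf{x},t)$ with continuous partial derivatives such that (1) $L$ is positive definite and (2) $\dot{L}$ is negative semi-definite, then the equilibrium point $\mathbf{0}$ is stable in the sense of Lyapunov. Uniform stability and uniform asymptotic stability: If, furthermore, (3) $L$ is decrescent, then the origin is uniformly stable. If condition 2 is strengthened by requiring that $\dot{L}$ be negative definite, then the equilibrium point is uniformly asymptotically stable.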
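
As a minimal sketch of how a Lyapunov condition can be checked, consider an assumed linearized two-stage model without perturbation (an illustration, not the paper's equations): the early weight obeys $\dot{w}_1=-\eta_1 g$ with gain error $g=w_1+w_2-w^{*}$, and the late weight obeys $\dot{w}_2=\eta_2 w_1$. With consolidation error $u=w_2-w^{*}$ and candidate $L=(g^{2}+u^{2})/2$, one finds $\dot{L}=-(\eta_1-\eta_2)g^{2}-\eta_2 u^{2}$, which is non-positive everywhere only when $\eta_2\le\eta_1$. The snippet below evaluates $\dot{L}$ on a grid of states; the function names, grid, and parameter values are hypothetical choices.

```python
import numpy as np


def lyapunov_rate(g, u, eta1, eta2):
    """dL/dt for L = (g**2 + u**2) / 2 under the assumed toy dynamics.

    g: gain error w1 + w2 - w_star; u: consolidation error w2 - w_star.
    """
    g_dot = -eta1 * g + eta2 * (g - u)  # d/dt of (w1 + w2 - w_star)
    u_dot = eta2 * (g - u)              # d/dt of (w2 - w_star), since w1 = g - u
    return g * g_dot + u * u_dot


def max_rate_on_grid(alpha, eta1=1.0, n=201):
    """Largest dL/dt over a square grid of states around the equilibrium.

    A value <= 0 means L never increases anywhere on the grid.
    """
    axis = np.linspace(-1.0, 1.0, n)
    g, u = np.meshgrid(axis, axis)
    return float(np.max(lyapunov_rate(g, u, eta1, alpha * eta1)))


slow = max_rate_on_grid(alpha=0.5)  # late stage slower than early stage
fast = max_rate_on_grid(alpha=2.0)  # late stage faster than early stage
print(f"max dL/dt, alpha=0.5: {slow:.3g}")
print(f"max dL/dt, alpha=2.0: {fast:.3g}")
```

In this sketch `slow` is non-positive while `fast` is positive: when the late stage learns faster than the early stage, this candidate function can increase, so it no longer certifies stability. That is a loss of the guarantee, not a proof of instability.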

Figures (4)

  • Figure 1: A toy tracking model demonstrating instability in systems consolidation. (A) Top: Single-stage model. The early-stage parameter, $w_{1}$, is directly tuned by the error signal $e(t)$, whereas the late-stage parameter, $w_{2}$, is fixed. (B) Two-stage tuning model. The early stage is as in the single-stage model. The late-stage weight $w_{2}$ is tuned using the output of the early stage as a secondary teaching signal. (C) Simulation of single-stage model showing that the model successfully converges to the desired output $r^*$ and the desired, tuned weight $w_1$ (dashed line) in the case of additive perturbation. (D) In the two-stage model with additive perturbation, when the consolidation process is slow enough (left panel), the model dynamics successfully converge and tune the weight $w_2$ to its desired value (dashed line). However, if the consolidation process becomes too fast (right panels), the system can show instability. (E,F) Similar to (C,D) but for a signal-dependent perturbation for which fast consolidation leads to unbounded growth. Note the abbreviated time scale of the right panel in (F), chosen for better visibility of the unbounded growth. See Appendix A for simulation details.
  • Figure 2: Lyapunov function theory for stability of the two-stage toy model. (A) In the limit that the perturbation goes to zero, $\mu\rightarrow0$, the closed-loop learning dynamics has a single fixed point and two nullclines (shown for $w^{*}=1$). (B) The Lyapunov function candidate $L$ has two terms: the squared gain error and the squared consolidation error. The most important property in order to have stable convergence in the Lyapunov sense is that the dynamics of the learning rules should avoid going uphill on the Lyapunov function surface. (C) When the ratio of learning rates $\alpha=\eta_{2}/\eta_{1}$ is less than a critical value $\alpha_{c}=1-\mu$, the learning is guaranteed to be stable. As the maximum perturbation amplitude reaches $|e|$, i.e., $\mu=1$, the region of guaranteed stability vanishes.
  • Figure 3: Amplification of perturbation in the region without stability guarantee in the two-stage toy model. (A) The steady state of $w_{2}$ exhibits an amplification of an infinitesimal sinusoidal perturbation probe $\xi_{p}=\epsilon\sin(\omega_{n}t)$ when $\alpha > 1$, shown for $\omega = \omega_n$ in the limit that $\mu\rightarrow0$. Top, $\alpha=3$ (red box); bottom, $\alpha=0.33$ (green box). (B) The steady-state percent amplification of the sinusoidal probe perturbation $\xi_{p}$ as a function of the probe frequency normalized by the undamped natural frequency $\omega_{n}$, in the limit that $\mu\rightarrow0$.
  • Figure 4: Cerebellar tuning and subsequent consolidation of the time constant of the oculomotor neural integrator. (A) The computation of a neural integrator. (B) The fine-tuning problem of neural integrators. Increasing or decreasing the strength of recurrent feedback by a small amount causes exponential growth or decay, respectively, of integrator activity (The shown neuronal time constant $\tau=1$ s is taken from [46], but the results apply more broadly). FR: firing rate. (C) Two-stage model for tuning the time constant of the neural integrator (NI). PC: cerebellar Purkinje cell. CF: climbing fiber carrying the retinal slip error signal. (D) Tuning the network from an initially unstable integrator condition (yellow), for a model with a fast learning rate in the early site of plasticity (cerebellum) and a slow learning rate in the late site (integrator). In the early stage of learning (orange), behavior is tuned primarily through the plasticity of the early site weights $\mathbf{w}_\text{PC}$, so that the integrator function depends on the cerebellum (dashed grey line: effect of cerebellar inactivation). The learned memory then gets consolidated into the recurrent connectivity $\mathbf{\Omega}$ of the neural integrator circuit, making the function independent of the cerebellum (dark magenta). Lower left panel shows strength of feedback in the cerebellar-NI feedback loop (x-axis), and within the NI recurrent network (y-axis, as characterized by the largest eigenvalue of $\mathbf{\Omega}$). (E) Tuning the model from the same initial condition as in (D) but with the learning rates of the early and late sites interchanged, so the early site has the slower learning rate. This can lead to instability in the system, as demonstrated by this example of oscillatory behavior in the weight space, resulting in alternating unstable (left) and leaky (right) eye positions, with a period ranging from 200 to 280 seconds.
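
The probe amplification described for Figure 3 can be illustrated with a hedged sketch. Assume, for illustration only, a linearized two-stage model in which a small probe $\xi$ is added to the error that drives the early weight: $\dot{w}_1=\eta_1(-g+\xi)$ with $g=w_1+w_2-w^{*}$, and $\dot{w}_2=\eta_2 w_1$. Eliminating $w_1$ gives a second-order system for $w_2$ with unit DC gain, $\omega_n^{2}=\eta_1\eta_2$ and damping ratio $\zeta=1/(2\sqrt{\alpha})$, where $\alpha=\eta_2/\eta_1$. The model form and the function names below are assumptions, not taken from the paper.

```python
import numpy as np


def probe_gain(omega_ratio, alpha):
    """Steady-state gain from probe to w2 for the assumed second-order model.

    omega_ratio: probe frequency divided by the undamped natural frequency.
    alpha: ratio of late- to early-stage learning rates (eta2 / eta1).
    """
    zeta = 0.5 / np.sqrt(alpha)  # damping ratio implied by the assumed model
    r = np.asarray(omega_ratio, dtype=float)
    # Magnitude of 1 / (1 - r**2 + 2j * zeta * r)
    return 1.0 / np.sqrt((1.0 - r**2) ** 2 + (2.0 * zeta * r) ** 2)


def percent_amplification(omega_ratio, alpha):
    """Percent change of the probe amplitude; negative values mean attenuation."""
    return 100.0 * (probe_gain(omega_ratio, alpha) - 1.0)


for alpha in (3.0, 0.33):
    amp = float(percent_amplification(1.0, alpha))
    print(f"alpha={alpha}: {amp:+.1f}% at the natural frequency")
```

Under these assumptions the gain at $\omega=\omega_n$ equals $1/(2\zeta)=\sqrt{\alpha}$, so the probe is amplified for $\alpha>1$ and attenuated for $\alpha<1$, in line with the two cases ($\alpha=3$ and $\alpha=0.33$) named in the caption.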

Theorems & Definitions (7)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Theorem 1: Lyapunov theorem for non-autonomous systems
  • Lemma 1