Context selectivity with dynamic availability enables lifelong continual learning
Martin Barry, Wulfram Gerstner, Guillaume Bellec
TL;DR
This work introduces GateON, a meta-plasticity framework for lifelong continual learning (CL) that combines gated context selectivity with a dynamic availability mechanism to regulate plasticity across tasks. It provides both a normative parametric theory (p-GateON) and a bio-plausible neuro-centric instantiation (n-GateON), unifying context gating and task-specific consolidation without replay. Empirical results on MNIST variants, CIFAR-100, and NLP benchmarks (including BERT-based settings) show strong forward transfer and reduced forgetting; GateON outperforms several replay-free baselines and remains effective as the number of tasks grows. The paper also offers experimental neuroscience predictions, arguing that neuronal availability signals and metaplasticity-like dynamics could underlie lifelong learning in the brain, with practical implications for designing robust CL systems in AI. Overall, GateON is a parsimonious, testable mechanism that balances forgetting and consolidation, supports transfer across modalities, and suggests concrete paths for neuroscience-informed CL research.
Abstract
"You never forget how to ride a bike." But how is that possible? The brain is able to learn complex skills, stop practicing them for years, learn other skills in between, and still retrieve the original knowledge when necessary. The mechanisms of this capability, referred to as lifelong learning (or continual learning, CL), are unknown. We suggest a bio-plausible meta-plasticity rule that builds on classical work in CL, which we summarize in two principles: (i) neurons are context selective, and (ii) a local availability variable partially freezes plasticity if the neuron was relevant for previous tasks. In a new neuro-centric formalization of these principles, we suggest that neuron selectivity and neuron-wide consolidation together form a simple and viable meta-plasticity hypothesis to enable CL in the brain. In simulation, this simple model balances forgetting and consolidation, leading to better transfer learning than contemporary CL algorithms on image recognition and natural language processing CL benchmarks.
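
To make the two principles concrete, the following is a minimal illustrative sketch, not the paper's GateON equations. It assumes a per-neuron gate in [0, 1] (context selectivity) and a per-neuron availability in [0, 1] that multiplies the weight update and shrinks after a task for neurons that were selective for it. The function names `gated_update` and `consolidate`, the `rate` parameter, the gate values, and the multiplicative consolidation rule are all hypothetical choices made for illustration.

```python
import numpy as np

def gated_update(weights, grad, gate, availability, lr=0.1):
    """One plasticity step: each neuron's (row's) update is scaled by gate * availability."""
    scale = (gate * availability)[:, None]
    return weights - lr * scale * grad

def consolidate(availability, gate, rate=0.9):
    """After a task, lower the availability of neurons that were selective for it (assumed rule)."""
    return availability * (1.0 - rate * gate)

rng = np.random.default_rng(0)
n_neurons, n_inputs = 4, 3
weights = rng.normal(size=(n_neurons, n_inputs))
availability = np.ones(n_neurons)  # every neuron starts fully plastic

# Hypothetical gates: task A recruits neurons 0 and 1, task B recruits neurons 1 and 2.
gate_a = np.array([1.0, 1.0, 0.0, 0.0])
gate_b = np.array([0.0, 1.0, 1.0, 0.0])

grad = np.ones((n_neurons, n_inputs))  # placeholder gradient, identical for every weight

weights_after_a = gated_update(weights, grad, gate_a, availability)
availability = consolidate(availability, gate_a)  # neurons 0 and 1 are partially frozen
weights_after_b = gated_update(weights_after_a, grad, gate_b, availability)

# Per-neuron magnitude of the weight change caused by task B.
change_b = np.abs(weights_after_b - weights_after_a).mean(axis=1)
```

In this toy setting, neuron 0 is untouched by task B because its gate is closed, neuron 1 (shared between tasks) still changes but ten times less than the freshly recruited neuron 2, and neuron 3 is never modified. That is the qualitative balance between consolidation and continued learning that the abstract describes.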
