Finding Structure in Continual Learning
Pourya Shamsolmoali, Masoumeh Zareapoor
TL;DR
This work tackles the stability-plasticity trade-off in continual learning by recasting optimization as Douglas-Rachford Splitting (DRS), which decouples plasticity (task fitting) from stability (prior alignment). It uses a Bayesian latent space with posterior-to-prior propagation and a Rényi-divergence penalty to guide learning without replay buffers, and it proves convergence to stationary points of the composite objective $F=f+g$. The proposed algorithm alternates proximal steps for the task-fitting and prior-alignment terms and fuses their outputs with a relaxed update, so that interference between updates diminishes over time. Empirically, the method surpasses state-of-the-art baselines on diverse benchmarks, achieving high accuracy, low forgetting on disjoint tasks, and strong forward transfer on joint tasks, all without replay.
Abstract
Learning from a stream of tasks usually pits plasticity against stability: acquiring new knowledge often causes catastrophic forgetting of past information. Most methods address this by summing competing loss terms, creating gradient conflicts that are managed with complex and often inefficient strategies such as external memory replay or parameter regularization. We propose a reformulation of the continual learning objective using Douglas-Rachford Splitting (DRS). This reframes the learning process not as a direct trade-off, but as a negotiation between two decoupled objectives: one promoting plasticity for new tasks and the other enforcing stability of old knowledge. By iteratively finding a consensus through their proximal operators, DRS provides a more principled and stable learning dynamic. Our approach achieves an efficient balance between stability and plasticity without the need for auxiliary modules or complex add-ons, providing a simpler yet more powerful paradigm for continual learning systems.
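To make the alternating scheme concrete, the following is a minimal sketch of a generic relaxed Douglas-Rachford iteration on a toy problem. It is not the paper's algorithm. It assumes, purely for illustration, that the plasticity term is the quadratic f(x) = 0.5 * ||x - a||^2 pulling toward a task target a, and that the stability term is the quadratic g(x) = 0.5 * mu * ||x - p||^2 pulling toward a prior mean p, standing in for the Rényi-divergence penalty. The names `prox_quadratic`, `douglas_rachford`, `task_target`, `prior_mean`, `mu`, `gamma`, and `relax` are hypothetical.

```python
def prox_quadratic(z, center, weight, gamma):
    """Closed-form proximal operator of gamma * (weight / 2) * ||x - center||^2 at z."""
    return [(zi + gamma * weight * ci) / (1.0 + gamma * weight)
            for zi, ci in zip(z, center)]


def douglas_rachford(task_target, prior_mean, mu=2.0, gamma=1.0, relax=1.0, iters=200):
    """Relaxed Douglas-Rachford splitting for f + g on a toy quadratic problem.

    f(x) = 0.5 * ||x - task_target||^2      (assumed stand-in for task fitting)
    g(x) = 0.5 * mu * ||x - prior_mean||^2  (assumed stand-in for prior alignment)
    """
    z = list(prior_mean)  # auxiliary variable, started at the prior
    x = list(z)
    for _ in range(iters):
        # Proximal step on the task-fitting term.
        x = prox_quadratic(z, task_target, 1.0, gamma)
        # Reflect through x, then take a proximal step on the prior-alignment term.
        reflected = [2.0 * xi - zi for xi, zi in zip(x, z)]
        y = prox_quadratic(reflected, prior_mean, mu, gamma)
        # Relaxed update that fuses the two proximal outputs.
        z = [zi + relax * (yi - xi) for zi, xi, yi in zip(z, x, y)]
    return x


task_target = [1.0, -2.0]
prior_mean = [0.0, 0.0]
mu = 2.0

x_star = douglas_rachford(task_target, prior_mean, mu=mu)

# For this toy problem the minimizer of f + g is known in closed form.
closed_form = [(a + mu * p) / (1.0 + mu) for a, p in zip(task_target, prior_mean)]
print(x_star, closed_form)
```

In this sketch each term is touched only through its own proximal operator, so the two gradients are never summed. The relaxation parameter `relax` is kept in the interval (0, 2), the standard range for convex problems. The nonconvex setting the paper analyzes would need its own step-size conditions, which this toy example does not address.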
