Do Neural Networks Lose Plasticity in a Gradually Changing World?

Tianhui Liu; Lili Mou

TL;DR

This work argues that loss of plasticity in continual learning largely stems from abrupt task shifts rather than intrinsic limitations of neural networks. It introduces a Gradually Changing Environment using input/output interpolation and task sampling to simulate smooth distribution shifts, backed by theoretical analysis under standard smoothness and local convexity assumptions. Empirically, the approach preserves trainability and generalization across vision benchmarks and language tasks, often matching or surpassing traditional abrupt-change mitigations. The findings offer a realistic, robust framework for real-world continual learning with reduced need for extensive hyperparameter tuning and complex regularization strategies.

Abstract

Continual learning has become a trending topic in machine learning. Recent studies have discovered an interesting phenomenon called loss of plasticity, in which neural networks gradually lose the ability to learn new tasks. However, existing plasticity research largely relies on contrived settings with abrupt task transitions, which often do not reflect real-world environments. In this paper, we propose to investigate a gradually changing environment, which we simulate by input/output interpolation and task sampling. We perform theoretical and empirical analyses, showing that the loss of plasticity is an artifact of abrupt task changes in the environment and can be largely mitigated if the world changes gradually.
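The two simulation mechanisms named in the abstract, output interpolation and task sampling, can be illustrated with a minimal sketch. The linear schedule, the one-hot targets, and all function names below are assumptions made for illustration; this section does not specify the authors' exact procedure.

```python
import random


def interpolate(old, new, alpha):
    """Convex combination of two equal-length vectors.

    alpha = 0 returns the old vector, alpha = 1 returns the new one.
    """
    return [(1.0 - alpha) * o + alpha * n for o, n in zip(old, new)]


def one_hot(label, num_classes):
    """One-hot target vector for a class label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def sample_task(alpha, rng):
    """Task sampling: draw from the new task with probability alpha."""
    return "new" if rng.random() < alpha else "old"


def schedule(num_steps):
    """Evenly spaced mixing coefficients from 0 to 1 (assumed linear)."""
    return [k / num_steps for k in range(num_steps + 1)]


# A sample whose label is class 0 under the old task and class 2 under
# the new task: the target drifts gradually instead of flipping at once.
old_target = one_hot(0, 3)
new_target = one_hot(2, 3)
soft_targets = [interpolate(old_target, new_target, a) for a in schedule(4)]

# Task sampling over the same schedule: early batches come mostly from
# the old task, late batches mostly from the new one.
rng = random.Random(0)
task_sequence = [sample_task(a, rng) for a in schedule(4)]
```

Input interpolation would apply the same `interpolate` helper to input vectors rather than targets. A larger `num_steps` gives a finer, more gradual transition between tasks.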

Paper Structure (16 sections, 6 theorems, 19 equations, 6 figures)

Introduction
Related Work
Problem Formulation
A Gradually Changing Environment
Experiments
Evaluation of Trainability
Evaluation of Generalizability
In-Depth Analyses
Conclusion
Proofs of Lemmas and Theorem
Proof of Lemma \ref{lemma:gd_converge}
Proof of Lemma \ref{lemma:linear_smooth_convex}
Proof of Lemma \ref{lemma:ball_in_new_bowl}
Proof of Lemma \ref{lemma:converge_to_shifted_minimizer}
Proof of Theorem \ref{theorem}
...and 1 more section

Key Result

Lemma 4.3

Consider gradient descent (GD) starting from any point in an $(r,\mu)$-locally strongly convex domain ${\mathbb{D}}_{{\bm{x}}_f^*}$ of a $\beta$-smooth function $f$, for some $\beta\ge\mu>0$. Let $({\bm{x}}_k)_{ k=1}^N$ be a sequence generated by GD. If the step size satisfies $\eta \le \min(\frac{1
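The setting of the lemma can be illustrated with a minimal sketch of gradient descent on a function that is both $\beta$-smooth and $\mu$-strongly convex. The quadratic objective, the constants, and the step size $\eta = 1/\beta$ are assumptions chosen for illustration (the lemma's own step-size condition is cut off above), and the function is strongly convex everywhere rather than only locally.

```python
def gradient_descent(grad, x0, eta, num_steps):
    """Run plain gradient descent and return all iterates, including x0."""
    xs = [list(x0)]
    for _ in range(num_steps):
        x = xs[-1]
        g = grad(x)
        xs.append([xi - eta * gi for xi, gi in zip(x, g)])
    return xs


# Assumed toy objective: f(x) = 0.5 * (MU * x[0]**2 + BETA * x[1]**2).
# Its Hessian has eigenvalues MU and BETA, so f is MU-strongly convex
# and BETA-smooth, with minimizer at the origin.
MU, BETA = 1.0, 4.0


def grad_f(x):
    """Gradient of the toy quadratic."""
    return [MU * x[0], BETA * x[1]]


def norm(x):
    """Euclidean norm."""
    return sum(v * v for v in x) ** 0.5


# Step size 1 / BETA is a standard choice for BETA-smooth functions;
# it is an assumption here, not the bound stated in the lemma.
iterates = gradient_descent(grad_f, [1.0, 1.0], eta=1.0 / BETA, num_steps=20)
distances = [norm(x) for x in iterates]
```

With these choices each step shrinks the first coordinate by the factor $1 - \mu/\beta$ and zeroes the second, so the distance to the minimizer never increases.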

Figures (6)

Figure 1: Trainability for Random Image Labeling tasks on MNIST and CIFAR-10 using an MLP or a ResNet-18 model. Output interpolation is more effective than other plasticity mitigation methods for these vision benchmarks.
Figure 2: Trainability for random Seq2Seq task on synthetic text using T5-small. Task sampling effectively mitigates loss of trainability.
Figure 3: Continual learning with Random Pixel Permuting tasks on EMNIST using a 4-layer MLP model. Generalizability is well preserved in a gradually changing environment.
Figure 4: Generalizability, evaluated by test BLEU-2 score, on Bigram Cipher tasks using a customized T5-small model. The gradually changing environment is effective in maintaining the test BLEU-2 score on new tasks.
Figure 5: The effect of the granularity of the interpolation step size on plasticity preservation for both trainability and generalizability tasks. A smaller step size better simulates a gradually changing environment and retains more plasticity.
...and 1 more figure

Theorems & Definitions (19)

Definition 4.1: Smoothness
Definition 4.2: Locally Strongly Convex
Lemma 4.3
proof
Lemma 4.4
proof
Lemma 4.5
proof
Lemma 4.6
proof
...and 9 more
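
For orientation, the textbook forms of the two properties named in Definitions 4.1 and 4.2 are sketched below. These are the standard global definitions, given here as an assumption; the paper's $(r,\mu)$-local version presumably restricts the second inequality to a neighborhood of radius $r$ around the minimizer ${\bm{x}}_f^*$, and its exact wording may differ.

```latex
% beta-smoothness: the gradient is beta-Lipschitz.
\|\nabla f({\bm{x}}) - \nabla f({\bm{y}})\| \le \beta \,\|{\bm{x}} - {\bm{y}}\|
\quad \text{for all } {\bm{x}}, {\bm{y}}.

% mu-strong convexity: f lies above a quadratic lower bound.
f({\bm{y}}) \ge f({\bm{x}}) + \nabla f({\bm{x}})^\top ({\bm{y}} - {\bm{x}})
+ \frac{\mu}{2}\,\|{\bm{y}} - {\bm{x}}\|^2
\quad \text{for all } {\bm{x}}, {\bm{y}}.
```

A function satisfying both on the same domain must have $\beta \ge \mu$, which is consistent with the condition $\beta\ge\mu>0$ in Lemma 4.3.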
