Can We Understand Plasticity Through Neural Collapse?

Guglielmo Bonifazi; Iason Chalas; Gian Hess; Jakub Łucki

TL;DR

The paper investigates whether neural collapse (NC) and plasticity loss (PL) co-occur in deep networks trained under non-stationary objectives. It quantifies NC, primarily through the NC1 metric, defined as $\mathrm{NC1} = \operatorname{Tr}(\Sigma_W \Sigma_B^{\dagger} / C)$, where $\Sigma_W$ and $\Sigma_B$ are the within-class and between-class feature covariance matrices, $\dagger$ denotes the pseudoinverse, and $C$ is the number of classes. Two continual-learning setups are studied: Permuted MNIST with an MLP, and warm-starting on CIFAR-10 with a ResNet-18. The results show a context-dependent relationship. In Permuted MNIST, NC1 strengthens as tasks accumulate but is strongly negatively correlated with PL (r = -0.94). In warm-starting, an NC–PL correlation is present early in training and wanes over time. Importantly, NC1 regularization during warm-up can improve both warm-up and full-dataset accuracies. The findings suggest that NC can contribute to PL in some regimes and that targeted NC-based regularization offers a practical way to mitigate PL, informing strategies for continual learning under changing objectives.

Abstract

This paper explores the connection between two recently identified phenomena in deep learning: plasticity loss and neural collapse. We analyze their correlation in different scenarios, revealing a significant association during the initial training phase on the first task. Additionally, we introduce a regularization approach to mitigate neural collapse, demonstrating its effectiveness in alleviating plasticity loss in this specific setting.

Paper Structure (12 sections, 6 figures)

Introduction
Models and Methods
Permuted MNIST
Warm starting
Results
Permuted MNIST
Warm starting
Discussion
Summary
Warm Starting
Evolution of Neural collapse metrics and Plasticity Loss over the number of warm-up epochs
Hyper-parameters for Warm Starting.

Figures (6)

Figure 2: NC1 and plasticity loss for different numbers of initial-task training epochs [1, 2, 5, 10, 50, 100]. Shaded regions show the mean plus and minus one standard deviation.
Figure 3: Correlation between NC1 and test accuracy on the Full dataset with a window size of 100. Shaded regions show the mean plus and minus one standard deviation.
Figure 4: Test accuracies on the Warm-Up dataset (right) and the Full dataset (left) for different warm-up schemes. Shaded regions show the mean plus and minus one standard deviation.
Figure 5: NC metrics against test accuracy on the Full dataset across all warm-up epochs.
Figure 6: NC metrics against test accuracy on the Full dataset across the first 100 warm-up epochs. All of the metrics show a clear correlation with test accuracy.
...and 1 more figure
