Branch-Tuning: Balancing Stability and Plasticity for Continual Self-Supervised Learning

Wenzhuo Liu, Fei Zhu, Cheng-Lin Liu

TL;DR

This article employs centered kernel alignment (CKA) to quantitatively analyze model stability and plasticity, revealing that batch normalization (BN) layers are critical for stability and convolutional layers for plasticity.

Abstract

Self-supervised learning (SSL) has emerged as an effective paradigm for deriving general representations from vast amounts of unlabeled data. However, as real-world applications continually integrate new content, the high computational and resource demands of SSL necessitate continual learning rather than complete retraining. This poses a challenge in striking a balance between stability and plasticity when adapting to new information. In this paper, we employ Centered Kernel Alignment (CKA) to quantitatively analyze model stability and plasticity, revealing the critical roles of batch normalization layers for stability and convolutional layers for plasticity. Motivated by this, we propose Branch-tuning, an efficient and straightforward method that achieves a balance between stability and plasticity in continual SSL. Branch-tuning consists of branch expansion and compression, and can be easily applied to various SSL methods without modifying the original methods or retaining old data or models. We validate our method through incremental experiments on various benchmark datasets, demonstrating its effectiveness and practical value in real-world scenarios. We hope our work offers new insights for future continual self-supervised learning research. The code will be made publicly available.
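
To make the CKA measurement concrete, the following is a minimal sketch of linear CKA between two feature matrices, written with NumPy. It is not the authors' code: the function name `linear_cka`, the choice of the linear (rather than kernel) variant, and the synthetic features are all assumptions made for illustration. How the paper turns CKA values into stability and plasticity scores is not stated in this summary, so the sketch only computes the similarity itself.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between feature matrices of shape (n_samples, n_features).

    Hypothetical helper for illustration; the two matrices must share the
    same number of rows (samples) but may differ in feature width.
    """
    # Center each feature dimension across samples.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


# Synthetic stand-ins for layer activations (assumption: 256 samples, 32 features).
rng = np.random.default_rng(0)
feats_old = rng.normal(size=(256, 32))

# An orthogonal rotation of the same features: linear CKA should stay near 1.
q, _ = np.linalg.qr(rng.normal(size=(32, 32)))
feats_rotated = feats_old @ q

# Independent random features: linear CKA should be much lower.
feats_random = rng.normal(size=(256, 32))

cka_same = linear_cka(feats_old, feats_rotated)
cka_diff = linear_cka(feats_old, feats_random)
print(f"rotated copy: {cka_same:.4f}, unrelated features: {cka_diff:.4f}")
```

In a continual-learning setting one would replace the synthetic matrices with activations of the same layer taken from two model snapshots on the same inputs; a value near 1 indicates the layer's representation changed little between snapshots.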

Paper Structure (22 sections, 10 equations, 9 figures, 10 tables, 2 algorithms)

Figures (9)

  • Figure 1: The difference between (a) Conventional SSL and (b) Continual SSL. In real-world scenarios, non-IID and infinite data emerge continuously. Continual SSL leverages this data flow to train self-supervised models.
  • Figure 2: Layer-wise stability and plasticity curves for Fixed, Fine-tuning, and Branch-tuning using 1x1, 1x3, and 3x3 structures. Branch-tuning achieves the best balance, as Fixed models exhibit low plasticity and Fine-tuning models have limited stability.
  • Figure 3: Differences between (a) a fixed model, (b) Fine-tuning, and (c)-(e) our method with 1x1, 1x3, and 3x3 branches, respectively.
  • Figure 4: Overview of Branch-tuning, a method to train SSL models in continuous streams of unlabeled data, achieving a balance between stability and plasticity. Our method comprises Branch-Expansion and Branch-Compression and can be applied to various SSL models without modifications. Model performance is assessed using Linear-Probe evaluation with labeled data.
  • Figure 5: Illustration of Branch Compression with three branch structures, 1x1, 1x3, and 3x3.
  • ...and 4 more figures
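
The captions for Figures 3 to 5 describe adding a 1x1, 1x3, or 3x3 branch and later compressing it, but this summary does not say how compression is done. One plausible reading, sketched below as an assumption rather than the paper's confirmed procedure, is that a parallel convolution branch is folded into the main 3x3 kernel by zero-padding the branch kernel to 3x3 and adding the weights. The helpers `conv2d_same` and `merge_branch` are hypothetical names, "1x3" is read as a kernel of height 1 and width 3, and biases and batch normalization are ignored.

```python
import numpy as np


def conv2d_same(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Naive stride-1, zero-padded ('same') convolution as used in CNNs.

    x: (C_in, H, W); w: (C_out, C_in, kH, kW) with odd kH and kW.
    """
    c_out, _, kh, kw = w.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((0, 0), (ph, ph), (pw, pw)))
    _, h, wd = x.shape
    out = np.zeros((c_out, h, wd))
    for i in range(kh):
        for j in range(kw):
            # Contribution of kernel tap (i, j) to every output position.
            patch = xp[:, i:i + h, j:j + wd]
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], patch)
    return out


def merge_branch(w_main: np.ndarray, w_branch: np.ndarray) -> np.ndarray:
    """Fold a smaller parallel kernel into the main kernel (assumed scheme).

    The branch kernel is zero-padded symmetrically to the main kernel's
    spatial size and added, which is valid because convolution is linear.
    """
    kh, kw = w_main.shape[2:]
    bh, bw = w_branch.shape[2:]
    ph, pw = (kh - bh) // 2, (kw - bw) // 2
    padded = np.pad(w_branch, ((0, 0), (0, 0), (ph, ph), (pw, pw)))
    return w_main + padded


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8))            # assumed input: 4 channels, 8x8
w_main = rng.normal(size=(6, 4, 3, 3))    # assumed main 3x3 convolution

max_err = 0.0
for branch_shape in [(1, 1), (1, 3), (3, 3)]:
    w_branch = rng.normal(size=(6, 4, *branch_shape))
    two_paths = conv2d_same(x, w_main) + conv2d_same(x, w_branch)
    one_path = conv2d_same(x, merge_branch(w_main, w_branch))
    max_err = max(max_err, float(np.abs(two_paths - one_path).max()))

print(f"largest difference between two-path and merged outputs: {max_err:.2e}")
```

Under this assumption the merged network has the same architecture and inference cost as the original one, which would be consistent with the abstract's claim that no old models need to be retained.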