First Session Adaptation: A Strong Replay-Free Baseline for Class-Incremental Learning

Aristeidis Panos, Yuriko Kobe, Daniel Olmeda Reino, Rahaf Aljundi, Richard E. Turner

TL;DR

This work addresses class-incremental learning under replay-free constraints by introducing First Session Adaptation (FSA), a baseline that adapts the backbone only in the first session and uses an incrementally updated LDA head to perform continual updates without memorizing past data. A FiLM-enabled variant (FSA-FiLM) further enhances few-shot performance by fine-tuning lightweight adapters, while still relying on the LDA head for memory-free updates. Across offline and three CIL settings (high-shot, few-shot+, few-shot), FSA and especially FSA-FiLM consistently outperform state-of-the-art baselines, challenging the need for continual backbone adaptation. The paper also provides a practical metric based on embedding cosine distance to predict when body adaptation will be beneficial, highlighting when replay-free baselines are likely to suffice in real-world scenarios. Overall, FSA establishes a simple, strong replay-free baseline with broad applicability and insights into when and how to adapt representations in continual learning.

Abstract

In Class-Incremental Learning (CIL), an image classification system is exposed to new classes in each learning session and must be updated incrementally. Methods approaching this problem have updated both the classification head and the feature extractor body at each session of CIL. In this work, we develop a baseline method, First Session Adaptation (FSA), that sheds light on the efficacy of existing CIL approaches and allows us to assess the relative performance contributions from head and body adaptation. FSA adapts a pre-trained neural network body only on the first learning session and fixes it thereafter; a head based on linear discriminant analysis (LDA) is then placed on top of the adapted body, allowing exact updates through CIL. FSA is replay-free, i.e., it does not memorize examples from previous sessions of continual learning. To empirically motivate FSA, we first consider a diverse selection of 22 image-classification datasets, evaluating different heads and body adaptation techniques in high/low-shot offline settings. We find that the LDA head performs well and supports CIL out-of-the-box. We also find that Featurewise Layer Modulation (FiLM) adapters are highly effective in the few-shot setting, and full-body adaptation in the high-shot setting. Second, we empirically investigate various CIL settings, among them high-shot CIL and few-shot CIL, including settings that have previously been used in the literature. We show that FSA significantly improves over the state-of-the-art in 15 of the 16 settings considered. FSA with FiLM adapters is especially performant in the few-shot setting. These results indicate that current approaches to continuous body adaptation are not working as expected. Finally, we propose a measure that can be applied to a set of unlabelled inputs which is predictive of the benefits of body adaptation.
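To make the "exact updates" property of an LDA head concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes embeddings come from a frozen body, equal class priors, and a shared covariance estimated from running sufficient statistics; the class name `IncrementalLDAHead` and the `shrinkage` ridge term are hypothetical choices made for illustration.

```python
import numpy as np


class IncrementalLDAHead:
    """Illustrative streaming LDA head (assumed formulation, not the paper's code).

    Stores only per-class sums/counts and a running sum of outer products,
    so no past examples need to be kept between sessions.
    """

    def __init__(self, dim, shrinkage=1e-3):
        self.dim = dim
        self.shrinkage = shrinkage          # hypothetical ridge term for invertibility
        self.sums = {}                      # class label -> sum of embeddings
        self.counts = {}                    # class label -> number of embeddings
        self.outer = np.zeros((dim, dim))   # running sum of x x^T over all data seen

    def update(self, feats, labels):
        """Fold one session's embeddings into the sufficient statistics."""
        feats = np.asarray(feats, dtype=float)
        labels = np.asarray(labels)
        self.outer += feats.T @ feats
        for c in np.unique(labels):
            x = feats[labels == c]
            key = c.item()
            self.sums[key] = self.sums.get(key, np.zeros(self.dim)) + x.sum(axis=0)
            self.counts[key] = self.counts.get(key, 0) + len(x)

    def _params(self):
        """Return class list, class means, and the pooled within-class covariance."""
        classes = sorted(self.counts)
        counts = np.array([self.counts[c] for c in classes], dtype=float)
        means = np.stack([self.sums[c] / self.counts[c] for c in classes])
        # Within-class scatter: sum_i x_i x_i^T - sum_c n_c mu_c mu_c^T
        scatter = self.outer - (means.T * counts) @ means
        dof = max(counts.sum() - len(classes), 1.0)
        cov = scatter / dof + self.shrinkage * np.eye(self.dim)
        return classes, means, cov

    def predict(self, feats):
        """Linear discriminant scores with equal priors; returns predicted labels."""
        classes, means, cov = self._params()
        w = np.linalg.solve(cov, means.T)         # shape (dim, num_classes)
        b = -0.5 * np.sum(means.T * w, axis=0)    # shape (num_classes,)
        scores = np.asarray(feats, dtype=float) @ w + b
        return np.array(classes)[scores.argmax(axis=1)]
```

Because the statistics are plain sums, calling `update` once per session yields the same means and covariance as a single call on all the data at once, which is the sense in which such a head can be updated exactly without replay.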

Paper Structure (35 sections, 2 equations, 4 figures, 44 tables, 1 algorithm)

Figures (4)

  • Figure 1: Average accuracy across all VTAB+ datasets using no-adaptation (NA), FiLM adaptation (A-FiLM), and full body adaptation (A-FB) for different classifier heads (NCM, LDA, Linear) and number of shots (5, 10, 50, All Data). The results correspond to the offline setting where all classes are available without any incremental learning.
  • Figure 2: Scatter plot of the accuracy differences between FSA-FiLM and NA against the minimum cosine distance between a dataset and the miniImagenet dataset, evaluated using the NA method. We consider the offline setting with 50 shots. An EfficientNet-B0 pre-trained on ImageNet-1k is used as the backbone.
  • Figure 3: Bar plot of the accuracy differences between FSA-FiLM and NA for the offline case with 50 shots.
  • Figure 4: Last session's test accuracy ($\uparrow$) and run time ($\downarrow$) for the "high-shot CIL" setting of \ref{sec:cil_comparisons}. GDumb-$m$ refers to memory buffer sizes $m \in \{ 200, 500, 1\text{k}, 2\text{k}, 5\text{k}, 10\text{k}^* \}$. We use a memory buffer of 10k images only for CORE50.
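
The minimum cosine distance on the x-axis of Figure 2 can be illustrated with a short sketch. The exact definition is not given in this excerpt, so the code below assumes one plausible reading: the smallest pairwise cosine distance between unlabelled target embeddings and reference-set embeddings, both taken from the unadapted (NA) body. The function name `min_cosine_distance` and the `eps` guard are hypothetical.

```python
import numpy as np


def min_cosine_distance(target_feats, reference_feats, eps=1e-12):
    """Smallest pairwise cosine distance between two sets of embeddings.

    Assumed reading of the measure: rows are embeddings, and the distance
    between two rows is 1 minus their cosine similarity.
    """
    t = np.asarray(target_feats, dtype=float)
    r = np.asarray(reference_feats, dtype=float)
    # L2-normalise rows; eps guards against division by zero for null vectors.
    t = t / np.maximum(np.linalg.norm(t, axis=1, keepdims=True), eps)
    r = r / np.maximum(np.linalg.norm(r, axis=1, keepdims=True), eps)
    return float(np.min(1.0 - t @ r.T))
```

Under this reading no labels are required, which matches the abstract's description of a measure applicable to a set of unlabelled inputs; how the paper aggregates over samples or classes may differ.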