Grow, Assess, Compress: Adaptive Backbone Scaling for Memory-Efficient Class Incremental Learning

Adrian Garcia-Castañeda, Jon Irureta, Jon Imaz, Aizea Lojo

TL;DR

This paper proposes a novel dynamic scaling framework that adaptively manages model capacity through a cyclic "GRow, Assess, ComprEss" (GRACE) strategy, and supplements backbone expansion with a novel saturation assessment phase that evaluates the utilization of the model's capacity.

Abstract

Class Incremental Learning (CIL) poses a fundamental challenge: maintaining a balance between the plasticity required to learn new tasks and the stability needed to prevent catastrophic forgetting. While expansion-based methods effectively mitigate forgetting by adding task-specific parameters, they suffer from uncontrolled architectural growth and memory overhead. In this paper, we propose a novel dynamic scaling framework that adaptively manages model capacity through a cyclic "GRow, Assess, ComprEss" (GRACE) strategy. Crucially, we supplement backbone expansion with a novel saturation assessment phase that evaluates the utilization of the model's capacity. This assessment allows the framework to make informed decisions to either expand the architecture or compress the backbones into a streamlined representation, preventing parameter explosion. Experimental results demonstrate that our approach achieves state-of-the-art performance across multiple CIL benchmarks, while reducing memory footprint by up to 73% compared to purely expansionist models.
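
As a rough illustration of the cycle described in the abstract, the following minimal sketch walks through one Grow, Assess, Compress step per task. It is not the authors' implementation: the `GraceState` class, the `saturation_fn` callback, the direction of the comparison (keep the expanded model when saturation is at or above the threshold, compress otherwise), and the multiplicative threshold decay are all assumptions made for illustration. The figure captions mention a threshold and a threshold decay, but this page does not state how they are applied.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class GraceState:
    """Toy bookkeeping for this sketch; hypothetical, not the paper's data structures."""
    num_backbones: int = 1
    threshold: float = 0.5  # hypothetical saturation threshold
    history: List[str] = field(default_factory=list)


def grace_step(
    state: GraceState,
    saturation_fn: Callable[[GraceState], float],
    decay: float = 0.9,
) -> GraceState:
    """Run one Grow -> Assess -> Compress cycle for an incoming task."""
    # Grow: add a backbone for the new task.
    state.num_backbones += 1

    # Assess: estimate how much of the model's capacity is in use.
    # saturation_fn is a placeholder for the paper's saturation measure.
    saturation = saturation_fn(state)

    # Compress (assumed rule): merge the backbones into a single one
    # unless the capacity is judged saturated.
    if saturation >= state.threshold:
        state.history.append("keep")
    else:
        state.num_backbones = 1
        state.history.append("compress")

    # Assumed multiplicative decay of the threshold after every task.
    state.threshold *= decay
    return state
```

With a constant, hypothetical saturation score, calling `grace_step` once per task shows how the backbone count grows only while the score stays at or above the decaying threshold.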

Paper Structure (29 sections, 15 equations, 5 figures, 6 tables, 1 algorithm)


Figures (5)

  • Figure 1: Mean average accuracy (top left) and parameter count (top right) on CIFAR-100 across different split protocols. The bottom panel illustrates the expansion dynamics for the Base 50 Inc 2 setting. Unlike baseline methods that either expand indiscriminately at every task or remain static, our proposed framework grows the network only when necessary, obtaining state-of-the-art performance with a significantly lower parameter count.
  • Figure 2: Overview of the proposed GRACE framework (best viewed in colour). The incorporation of each new task follows a three-stage pipeline: (1) Grow, the architecture expansion phase (Section \ref{sec:growphase}); (2) Assess, the capacity evaluation phase (Section \ref{sec:assessphase}); and (3) Compress, the model consolidation phase (Section \ref{sec:compressphase}).
  • Figure 3: Model consolidation phase of the proposed GRACE framework. The expanded model ($f_{exp}$) acts as a teacher to guide the training of the compressed student model ($f_{com}$). Knowledge transfer is facilitated through a multi-objective loss function: logit-level distillation ($\mathcal{L}_{kd}$), feature-level alignment ($\mathcal{L}_{feat}$), and cross-entropy loss ($\mathcal{L}_{ce}$) to maintain classification performance.
  • Figure 4: GRACE's expansion dynamics (left) and accuracy change (right) on CIFAR-100 Base 0 Inc 10 with different threshold and decay configurations.
  • Figure 5: Sensitivity analysis of threshold and threshold decay values on CIFAR-100. The plots compare performance across different protocol settings, highlighting the Pareto frontier for hyperparameter selection. Note that some points may overlap, meaning that runs with different threshold values produced identical compression decisions.
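
The multi-objective loss named in the Figure 3 caption can be sketched as a weighted sum of its three terms. The sketch below is a plain-Python illustration under stated assumptions, not the paper's formulation: the temperature-softened KL form of $\mathcal{L}_{kd}$, the mean-squared-error form of $\mathcal{L}_{feat}$, the weights `w_kd` and `w_feat`, and the assumption that teacher and student features already have the same dimension are all hypothetical choices.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ce_loss(student_logits: Sequence[float], label: int) -> float:
    """Cross-entropy of the student prediction against the true label (L_ce)."""
    return -math.log(softmax(student_logits)[label])


def kd_loss(student_logits: Sequence[float], teacher_logits: Sequence[float],
            temperature: float = 2.0) -> float:
    """Assumed L_kd: KL(teacher || student) on softened probabilities."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def feat_loss(student_feat: Sequence[float], teacher_feat: Sequence[float]) -> float:
    """Assumed L_feat: mean squared error between equal-length feature vectors."""
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat)) / len(student_feat)


def compress_loss(student_logits: Sequence[float], teacher_logits: Sequence[float],
                  student_feat: Sequence[float], teacher_feat: Sequence[float],
                  label: int, w_kd: float = 1.0, w_feat: float = 1.0) -> float:
    """Weighted sum L_ce + w_kd * L_kd + w_feat * L_feat (weights are hypothetical)."""
    return (ce_loss(student_logits, label)
            + w_kd * kd_loss(student_logits, teacher_logits)
            + w_feat * feat_loss(student_feat, teacher_feat))
```

When the student ($f_{com}$) reproduces the teacher's ($f_{exp}$) logits and features exactly, both distillation terms vanish and only the cross-entropy term remains, which is a quick sanity check for any concrete implementation of this kind of objective.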