
Enhancing Pretrained Model-based Continual Representation Learning via Guided Random Projection

Ruilin Li, Heming Zou, Xiufeng Yan, Zheming Liang, Jie Yang, Chenliang Li, Xue Yang

Abstract

Recent Random Projection Layer (RPL)-based approaches to continual representation learning have demonstrated strong performance when built upon a pre-trained model (PTM). These methods insert a randomly initialized RPL after the PTM to enhance the feature representation in the initial stage. A linear classification head is then updated analytically during the continual learning stage. However, when the domain gap between the pre-trained representations and the target domain is severe, a randomly initialized RPL exhibits limited expressivity. Substantially scaling up the RPL dimension can improve expressivity, but it also yields an ill-conditioned feature matrix, which destabilizes the recursive analytic updates of the linear head. To this end, we propose the Stochastic Continual Learner with MemoryGuard Supervisory Mechanism (SCL-MGSM). Instead of random initialization, MGSM constructs the projection layer via a principled, data-guided mechanism that progressively selects target-aligned random bases to adapt the PTM representation to downstream tasks. This yields a compact yet expressive RPL while improving the numerical stability of the analytic updates. Extensive experiments on multiple exemplar-free Class Incremental Learning (CIL) benchmarks demonstrate that SCL-MGSM outperforms state-of-the-art methods.
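The baseline pipeline summarized in the abstract (a frozen random projection followed by an analytically updated linear head) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the ReLU activation, the Gaussian initialization, the sizes `d`, `L`, `C`, the regularizer `lam`, and the names `rpl_features`, `RecursiveRidge`, and `make_task` are all assumptions made for illustration, and the "PTM features" are synthetic random vectors. The sketch only shows that a recursive ridge update over task batches can reproduce the closed-form solution on all data seen so far without storing past samples.

```python
import numpy as np

rng = np.random.default_rng(0)
d, L, C, lam = 16, 64, 4, 1.0  # hypothetical sizes: PTM dim, RPL dim, classes, ridge term

# Frozen, randomly initialized projection (Gaussian init is an assumption).
W_rpl = rng.standard_normal((d, L))

def rpl_features(Z):
    """Project PTM features through the frozen RPL (ReLU is an assumption)."""
    return np.maximum(Z @ W_rpl, 0.0)

class RecursiveRidge:
    """Linear head updated by recursive ridge regression (Woodbury identity)."""

    def __init__(self, dim, num_classes, lam):
        self.P = np.eye(dim) / lam            # running inverse of (H^T H + lam I)
        self.W = np.zeros((dim, num_classes))

    def update(self, H, Y):
        n = H.shape[0]
        # K = (I + H P H^T)^{-1} H P, so P_new = P - P H^T K
        K = np.linalg.solve(np.eye(n) + H @ self.P @ H.T, H @ self.P)
        self.P = self.P - self.P @ H.T @ K
        # Correct the weights using only the new batch.
        self.W = self.W + self.P @ H.T @ (Y - H @ self.W)

def make_task(n):
    """Synthetic stand-in for PTM features and one-hot labels of one task."""
    Z = rng.standard_normal((n, d))
    labels = rng.integers(0, C, size=n)
    return Z, np.eye(C)[labels]

head = RecursiveRidge(L, C, lam)
tasks = [make_task(100), make_task(100)]
for Z_t, Y_t in tasks:
    head.update(rpl_features(Z_t), Y_t)

# Closed-form ridge solution on all data, for comparison only.
H_all = np.vstack([rpl_features(Z) for Z, _ in tasks])
Y_all = np.vstack([Y for _, Y in tasks])
W_batch = np.linalg.solve(H_all.T @ H_all + lam * np.eye(L), H_all.T @ Y_all)
max_gap = float(np.max(np.abs(head.W - W_batch)))
print("max |W_recursive - W_batch| =", max_gap)
```

The running inverse `P` is the quantity whose conditioning matters here: as the RPL dimension grows relative to the available data, `H^T H` tends toward ill-conditioning, which is the instability the abstract attributes to simply scaling up a random RPL.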

Paper Structure

This paper contains 29 sections, 1 theorem, 33 equations, 18 figures, 7 tables, 1 algorithm.

Key Result

Theorem 1

Let $\boldsymbol{y}\in\mathbb{R}^{N}$ be the target vector, and let $\boldsymbol{H}_{L-s} \in \mathbb{R}^{N \times (L-s)}$ be the output matrix of the current network $f_{L-s}$. Suppose $\boldsymbol{W}_{\beta_{L-s}} = [\beta_1,\dots,\beta_{L-s}]^{\top}$ denotes the output weights, and define the current [...] If the batch of new hidden units with output $\boldsymbol{H}_s$ satisfies the following inequality [...]

Figures (18)

  • Figure 1: Overview of the prior RPL-based CIL paradigm and comparison of initial-stage RPL construction. (a) Prior Methods: After first stage adaptation, frozen PTM extracts features $\boldsymbol{Z}_{\text{init}}$, which are projected through a randomly initialized RPL ($\boldsymbol{W}_{\text{RPL}}$) to obtain high-dimensional features $\boldsymbol{H}_{\text{init}}$, followed by computing classifier weights $\boldsymbol{W}_{\beta}$. During incremental learning, new features $\boldsymbol{Z}_t$ pass through the same frozen $\boldsymbol{W}_{\text{RPL}}$ to $\boldsymbol{H}_t$, and only $\boldsymbol{W}_{\beta}$ is updated to $\boldsymbol{W}_{\beta}^{(t)}$ via recursive ridge regression. (b) Our Method: We leverage the initial task and PTM to inform MGSM-guided RPL construction. During incremental learning, $\boldsymbol{W}_{\beta}$ is updated recursively as in (a).
  • Figure 2: Overview of MGSM-guided RPL construction in SCL-MGSM. Data from any stage can serve as the initialization set to build the RPL from scratch. Random hidden units are progressively sampled, evaluated by MGSM, and appended to the RPL only if they satisfy the supervisory criterion. The construction terminates once the residual converges below a predefined threshold $\varepsilon$. See the appendix for details.
  • Figure 2: Performance Comparison of RPL Construction Strategies Without FSA.
  • Figure 3: Gaussian Initialization.
  • Figure 3: Performance Comparison of $s$ and $B_{max}$ on ImageNet-R (B-0 Inc-5).
  • ...and 13 more figures
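
The construction loop described in the Figure 2 caption (sample random hidden units, evaluate them, append only those that pass, stop once the residual falls below $\varepsilon$) can be sketched as follows. The actual supervisory inequality of Theorem 1 is not reproduced on this page, so the acceptance rule below is a stand-in assumption: a candidate batch of `s` units is kept only if it lowers the ridge-fit residual by a relative margin `min_gain`. The function names, `min_gain`, `max_trials`, `lam`, the ReLU activation, and the synthetic data are likewise hypothetical.

```python
import numpy as np

def ridge_residual(H, Y, lam):
    """Frobenius norm of the residual after a ridge fit of Y on H."""
    W = np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ Y)
    return float(np.linalg.norm(Y - H @ W))

def build_guided_rpl(Z, Y, s=8, eps=0.5, min_gain=1e-3,
                     max_trials=50, lam=1e-3, seed=0):
    """Greedily grow a projection from random candidates (illustrative rule)."""
    rng = np.random.default_rng(seed)
    n, d = Z.shape
    W_rpl = np.empty((d, 0))              # accepted random bases so far
    H = np.empty((n, 0))                  # their outputs on the init set
    residual = float(np.linalg.norm(Y))   # empty network predicts zero
    for _ in range(max_trials):
        if residual <= eps:               # stop once the residual is small enough
            break
        W_cand = rng.standard_normal((d, s))             # sample s random units
        H_cand = np.hstack([H, np.maximum(Z @ W_cand, 0.0)])
        r_cand = ridge_residual(H_cand, Y, lam)
        # Stand-in criterion (assumption): require a relative residual drop.
        if residual - r_cand >= min_gain * residual:
            W_rpl = np.hstack([W_rpl, W_cand])
            H, residual = H_cand, r_cand
    return W_rpl, residual

# Synthetic initialization set: features Z and one-hot targets Y.
rng = np.random.default_rng(1)
Z = rng.standard_normal((120, 10))
labels = np.argmax(Z @ rng.standard_normal((10, 3)), axis=1)
Y = np.eye(3)[labels]

W_rpl, final_residual = build_guided_rpl(Z, Y)
print("RPL width:", W_rpl.shape[1], "final residual:", final_residual)
```

Because rejected candidates are discarded, the resulting layer contains only bases that helped fit the targets under the chosen rule, which is the intuition behind a compact yet expressive RPL; the paper's own criterion and its guarantees should be taken from Theorem 1 and the appendix rather than from this sketch.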

Theorems & Definitions (3)

  • Theorem 1
  • proof
  • proof