Enhancing Pretrained Model-based Continual Representation Learning via Guided Random Projection

Ruilin Li; Heming Zou; Xiufeng Yan; Zheming Liang; Jie Yang; Chenliang Li; Xue Yang

Abstract

Recent Random Projection Layer (RPL)-based paradigms for continual representation learning have demonstrated superior performance when built on a pre-trained model (PTM). These methods insert a randomly initialized RPL after the PTM to enhance feature representation in the initial stage; a linear classification head is then updated analytically during the continual learning stage. However, when the domain gap between the pre-trained representations and the target domain is severe, a randomly initialized RPL exhibits limited expressivity. Substantially scaling up the RPL dimension can improve expressivity, but it also induces an ill-conditioned feature matrix, which destabilizes the recursive analytic updates of the linear head. To this end, we propose the Stochastic Continual Learner with MemoryGuard Supervisory Mechanism (SCL-MGSM). Unlike random initialization, MGSM constructs the projection layer via a principled, data-guided mechanism that progressively selects target-aligned random bases to adapt the PTM representation to downstream tasks. This facilitates the construction of a compact yet expressive RPL while improving the numerical stability of analytic updates. Extensive experiments on multiple exemplar-free Class Incremental Learning (CIL) benchmarks demonstrate that SCL-MGSM achieves superior performance compared to state-of-the-art methods.
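The prior RPL-based pipeline summarized above (a frozen random projection followed by an analytically updated linear head) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the ReLU nonlinearity, the Gaussian initialization, the regularization strength `lam`, the toy dimensions, and the names `project` and `RecursiveRidgeHead` are all assumptions made for illustration. The recursive update uses the standard Woodbury identity for ridge regression, so processing tasks one at a time should reproduce the solution obtained from all data at once.

```python
import numpy as np

def project(Z, W_rpl):
    """Frozen random projection with a ReLU (the nonlinearity is an assumption)."""
    return np.maximum(Z @ W_rpl, 0.0)

class RecursiveRidgeHead:
    """Linear head W_beta updated analytically, one task at a time (illustrative)."""

    def __init__(self, dim, num_classes, lam=1.0):
        # R tracks (H^T H + lam * I)^{-1} over all data seen so far.
        self.R = np.eye(dim) / lam
        self.W = np.zeros((dim, num_classes))

    def update(self, H, Y):
        # Woodbury identity: R_new = R - R H^T (I + H R H^T)^{-1} H R
        K = np.linalg.solve(np.eye(H.shape[0]) + H @ self.R @ H.T, H @ self.R)
        self.R = self.R - self.R @ H.T @ K
        # Correct the weights using only the new task's features and labels.
        self.W = self.W + self.R @ H.T @ (Y - H @ self.W)

# Toy data standing in for PTM features (hypothetical sizes).
rng = np.random.default_rng(0)
d, M, C = 16, 64, 4                      # feature dim, RPL width, classes
W_rpl = rng.standard_normal((d, M))      # randomly initialized, then frozen
Z1, Z2 = rng.standard_normal((40, d)), rng.standard_normal((40, d))
Y1 = np.eye(C)[rng.integers(0, 2, 40)]   # task 1 introduces classes 0-1
Y2 = np.eye(C)[rng.integers(2, 4, 40)]   # task 2 introduces classes 2-3

head = RecursiveRidgeHead(M, C, lam=1.0)
for Z, Y in ((Z1, Y1), (Z2, Y2)):
    head.update(project(Z, W_rpl), Y)

# Reference: ridge regression fitted on all tasks jointly.
H_all = project(np.vstack([Z1, Z2]), W_rpl)
Y_all = np.vstack([Y1, Y2])
W_batch = np.linalg.solve(H_all.T @ H_all + 1.0 * np.eye(M), H_all.T @ Y_all)
```

Because the update only needs `R` and `W`, no exemplars from earlier tasks are stored. The inverse Gram matrix `R` is also where an ill-conditioned feature matrix would show up numerically, which is the instability the abstract attributes to very wide RPLs.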

Paper Structure (29 sections, 1 theorem, 33 equations, 18 figures, 7 tables, 1 algorithm)

Introduction
Related Works
Revisiting RPL-based Analytic Continual Learning
Exemplar-free Class-Incremental Learning Setting
The Supervisory Mechanism of SCNs
Prior RPL-based Continual Representation Learning Methods Framework
Method
SCL-MGSM Construction
Underlying Rationale
Experiments
Experiment Setting
Main Results
Mechanism Analysis
Hyperparameter Sensitivity
Additional Analysis
...and 14 more sections

Key Result

Theorem 1

Let $\boldsymbol{y}\in\mathbb{R}^{N}$ be the target vector, and let $\boldsymbol{H}_{L-s} \in \mathbb{R}^{N \times (L-s)}$ be the output matrix of the current network $f_{L-s}$. Suppose $\boldsymbol{W}_{\beta_{L-s}} = [\beta_1,\dots,\beta_{L-s}]^{\top}$ denotes the output weights, and define the current [...] If the batch of new hidden units with output $\boldsymbol{H}_s$ satisfies the following inequality [...]

Figures (18)

Figure 1: Overview of the prior RPL-based CIL paradigm and comparison of initial-stage RPL construction. (a) Prior Methods: After first-stage adaptation, the frozen PTM extracts features $\boldsymbol{Z}_{\text{init}}$, which are projected through a randomly initialized RPL ($\boldsymbol{W}_{\text{RPL}}$) to obtain high-dimensional features $\boldsymbol{H}_{\text{init}}$, from which the classifier weights $\boldsymbol{W}_{\beta}$ are computed. During incremental learning, new features $\boldsymbol{Z}_t$ pass through the same frozen $\boldsymbol{W}_{\text{RPL}}$ to yield $\boldsymbol{H}_t$, and only $\boldsymbol{W}_{\beta}$ is updated to $\boldsymbol{W}_{\beta}^{(t)}$ via recursive ridge regression. (b) Our Method: We leverage the initial task and PTM to inform MGSM-guided RPL construction. During incremental learning, $\boldsymbol{W}_{\beta}$ is updated recursively as in (a).
Figure 2: Overview of MGSM-guided RPL construction in SCL-MGSM. Data from any stage can serve as the initialization set to build the RPL from scratch. Random hidden units are progressively sampled, evaluated by MGSM, and appended to the RPL only if they satisfy the supervisory criterion. The construction terminates once the residual converges below a predefined threshold $\varepsilon$. See Appendix \ref{a_MGSM_process} for details.
Figure 2: Performance Comparison of RPL Construction Strategies Without FSA.
Figure 3: Gaussian Initialization.
Figure 3: Performance Comparison of $s$ and $B_{max}$ on ImageNet-R (B-0 Inc-5).
...and 13 more figures
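The sample, evaluate, and append loop described in the caption on MGSM-guided RPL construction can be sketched roughly as follows. The acceptance rule used here (keep a candidate batch only if it lowers the least-squares residual by a relative margin `min_gain`) is a stand-in assumption and is not the supervisory inequality of Theorem 1, which is not reproduced on this page. The function names, the ReLU nonlinearity, the Gaussian sampling, the use of $B_{max}$ as a cap on candidate batches, and all default values are likewise hypothetical.

```python
import numpy as np

def fit_residual(Z, Y, W):
    """Frobenius norm of the least-squares residual for features relu(Z W)."""
    H = np.maximum(Z @ W, 0.0)
    beta, *_ = np.linalg.lstsq(H, Y, rcond=None)
    return np.linalg.norm(Y - H @ beta)

def build_guided_rpl(Z, Y, s=8, max_units=128, eps=1e-2,
                     min_gain=1e-3, b_max=100, seed=0):
    """Greedily grow a projection matrix from accepted random batches.

    s        : hidden units sampled per candidate batch
    b_max    : maximum number of candidate batches to try (assumed meaning)
    min_gain : assumed acceptance margin, NOT the paper's criterion
    """
    rng = np.random.default_rng(seed)
    d = Z.shape[1]
    W = np.empty((d, 0))               # start from an empty RPL
    residual = np.linalg.norm(Y)       # residual of the empty model
    for _ in range(b_max):
        if residual <= eps or W.shape[1] + s > max_units:
            break                      # converged or width budget exhausted
        candidate = rng.standard_normal((d, s)) / np.sqrt(d)
        W_try = np.hstack([W, candidate])
        new_residual = fit_residual(Z, Y, W_try)
        # Append the batch only if it passes the (assumed) supervisory check.
        if new_residual <= (1.0 - min_gain) * residual:
            W, residual = W_try, new_residual
    return W, residual

# Toy initialization set standing in for PTM features and one-hot labels.
rng = np.random.default_rng(1)
Z = rng.standard_normal((200, 16))
Y = np.eye(5)[rng.integers(0, 5, 200)]
W_guided, res = build_guided_rpl(Z, Y)
```

Rejected batches are simply discarded, so the resulting width depends on the data rather than being fixed in advance, which is one way to read the caption's claim that units are appended only when they satisfy the criterion.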

Theorems & Definitions (3)

Theorem 1
proof
proof
