Incremental Online Learning of Randomized Neural Network with Forward Regularization

Junda Wang, Minghui Hu, Ning Li, Abdulaziz Al-Ali, Ponnuthurai Nagaratnam Suganthan

TL;DR

This work tackles the challenge of online learning for deep randomized networks under strict online constraints by introducing Incremental Online Learning (IOL) with ridge (-R) and forward (-F) regularization. It develops closed-form, recursive update rules for edRVFL architectures, derives regret bounds under adversarial batch streams, and demonstrates that forward regularization yields substantially tighter regrets and faster adaptation. Theoretical results are complemented by extensive experiments across regression, classification, and large-scale datasets, showing that edRVFL-F consistently outperforms edRVFL-R and other baselines in online settings. The study thus offers a practical, theoretically grounded pathway to robust, memory-efficient online learning with deep randomized architectures, suitable for real-time decision-making in non-stationary environments.

Abstract

Online learning of deep neural networks suffers from hysteretic non-incremental updating, growing memory usage, retrospective retraining on past data, and catastrophic forgetting. To alleviate these drawbacks and enable progressive, immediate decision-making, we propose a novel Incremental Online Learning (IOL) process for Randomized Neural Networks (Randomized NN), a framework that facilitates continuous improvement of Randomized NN performance in restrictive online scenarios. Within the framework, we further introduce IOL with ridge regularization (-R) and IOL with forward regularization (-F). -R generates stepwise incremental updates without retrospective retraining and avoids catastrophic forgetting. We then replace -R with -F, which enhances precognition learning ability through semi-supervision and achieves better online regret relative to offline global experts than -R during IOL. We derive the IOL algorithms for Randomized NN with -R and -F on non-stationary batch streams, featuring recursive weight updates and variable learning rates. Additionally, we analyze and theoretically derive relative cumulative regret bounds for the Randomized NN learners with -R/-F in IOL under adversarial assumptions using a novel methodology, and present several corollaries that show the advantage of -F in IOL in both online learning acceleration and regret bounds. Finally, we evaluate the proposed methods on regression and classification tasks across diverse datasets; the results validate the efficacy of the IOL frameworks for Randomized NN and the advantages of forward regularization.
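The idea of recursive weight updates without retrospective retraining can be illustrated with textbook recursive least squares under a ridge prior. The sketch below is a generic illustration and not the paper's edRVFL-R or edRVFL-F derivation: the class name `RecursiveRidge`, the method `partial_fit`, the parameter `lam`, and the synthetic data are all hypothetical. It processes each batch one sample at a time with a Sherman-Morrison update, so past batches never need to be revisited.

```python
# Minimal sketch: recursive ridge regression on a batch stream.
# Generic recursive least squares, NOT the paper's exact -R/-F algorithm.
# All names (RecursiveRidge, partial_fit, lam) are hypothetical.
import random


class RecursiveRidge:
    def __init__(self, dim, lam=1.0):
        # P tracks (lam * I + sum_s x_s x_s^T)^{-1}; it starts at I / lam.
        self.P = [[(1.0 / lam if i == j else 0.0) for j in range(dim)]
                  for i in range(dim)]
        self.beta = [0.0] * dim

    def predict(self, x):
        return sum(b * xi for b, xi in zip(self.beta, x))

    def _update_one(self, x, y):
        # Sherman-Morrison rank-one update of P, followed by a
        # gain-weighted correction of beta using the pre-update error.
        Px = [sum(p * xj for p, xj in zip(row, x)) for row in self.P]
        denom = 1.0 + sum(xi * pxi for xi, pxi in zip(x, Px))
        err = y - self.predict(x)
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.P[i][j] -= Px[i] * Px[j] / denom
        for i in range(n):
            self.beta[i] += Px[i] / denom * err

    def partial_fit(self, X, Y):
        # Consume one batch; the batch can be discarded afterwards.
        for x, y in zip(X, Y):
            self._update_one(x, y)
        return self


random.seed(0)
dim, lam = 3, 0.5
true_w = [1.0, -2.0, 0.5]  # synthetic ground truth, for illustration only
model = RecursiveRidge(dim, lam)
seen_X, seen_Y = [], []  # stored only to verify the result afterwards
for _ in range(5):  # five batches of four samples each
    X = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(4)]
    Y = [sum(w * xi for w, xi in zip(true_w, x)) + random.gauss(0, 0.1)
         for x in X]
    model.partial_fit(X, Y)
    seen_X += X
    seen_Y += Y
```

Under these assumptions, the weights after the last batch satisfy the normal equations of offline ridge regression on all data seen so far, which is why no retraining on past batches is needed. The shrinking matrix `P` acts as a per-direction step size, loosely analogous to the variable learning rates mentioned in the abstract.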

Paper Structure

This paper contains 24 sections, 22 theorems, 60 equations, 16 figures, 14 tables, 2 algorithms.

Key Result

Lemma 1

Offline learning refers to the learning process of an offline expert on the global dataset. Assume ${\ell _t}$ and ${U_0}(\theta )$ are differentiable and convex, the subscript $0$ denotes the initial setup of prior knowledge, and there always exists a solution in $\Theta$: where ${U_{T + 1}}(\theta ) = {\Delta _{{U_0}}}(\theta ,{\theta _0}) + {\ell _{1..T}}(\theta )$, ${\theta _{T + 1}}$ represents the updated

Figures (16)

  • Figure 1: Task stream is learned incrementally. The proposed IOL frameworks investigate IL processes with -R/-F for deep structures under restricted conditions, such as limited retrieval and reuse.
  • Figure 2: The IOL processes of edRVFL-R/F on a batch stream. The stream over time is shown by arrows in varying shades of rufous. Extracted features inside the edRVFL layers are painted in blue and yellow of varying brightness to distinguish the -R and -F algorithmic styles. Clustered trainable learners inside edRVFL are progressively updated, and uncertainty is gradually removed (shown by increasing sharpness), reflecting improving performance as data batches arrive. Past chunks can be discarded without retrieval during the processes. Splines denote the participation of foresight data in the -F style.
  • Figure 3: One online learner inside edRVFL updated with -R and with -F, respectively. The learner using ridge (blue solid) tries to match ${L_{1..t}}$ (dark red) and evolves into $\beta _{t + 1}^r$ (blue arrow) on batch $t$; the same logic yields $\beta _{t + 2}^r$ (blue dashed) at time $t+1$, with update and prediction $pred(\beta _{t + 2}^r)$ over ${L_{1..t+1}}$ (light red dashed). The learner with the forward term (yellow solid) tries to meet ${L_{1..t}} + {\hat{L}_{t + 1}}$ (light red solid), which includes the estimated cost, at time $t$. This suggests that the -F style can rectify the IOL process.
  • Figure 4: Test-set RMSE curves of the edRVFL-R/F baselines on the regression task during the IOL processes. For the baselines' setups and values, refer to Table 5; performance is shown as test-set RMSE over time (data batches). Solid lines: immediate ensemble RMSE of the baselines. Dashed lines: cumulative ensemble RMSE. Boxplots: immediate RMSE statistics of the interior sub-learners (layers). The X-axis is locally enlarged in the inset subfigure.
  • Figure 5: Ablation experiments comparing the IOL processes of edRVFL-R/F on the regression task. The baselines' setups are given in Table 5, and $[ \cdot ]$ denotes the rounding operation. For edRVFL-R/F with varied setups, performance during the IOL processes is shown as immediate test-set RMSE over time. Some panels enlarge part of the X-axis in inset subfigures.
  • ...and 11 more figures

Theorems & Definitions (26)

  • Definition 1
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Lemma 5
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Lemma 6
  • ...and 16 more