Online-LoRA: Task-free Online Continual Learning via Low Rank Adaptation

Xiwen Wei, Guihong Li, Radu Marculescu

TL;DR

This paper introduces Online-LoRA, a novel framework for task-free online continual learning (OCL) that fine-tunes pre-trained Vision Transformer models in real time, avoiding the limitations of rehearsal buffers while leveraging the performance benefits of pre-trained models.

Abstract

Catastrophic forgetting is a significant challenge in online continual learning (OCL), especially for non-stationary data streams that do not have well-defined task boundaries. This challenge is exacerbated by the memory constraints and privacy concerns inherent in rehearsal buffers. To tackle catastrophic forgetting, we introduce Online-LoRA, a novel framework for task-free OCL. Online-LoRA fine-tunes pre-trained Vision Transformer (ViT) models in real time, which avoids the limitations of rehearsal buffers while leveraging the performance benefits of pre-trained models. As the main contribution, our approach features a novel online weight regularization strategy to identify and consolidate important model parameters. Moreover, Online-LoRA leverages the training dynamics of loss values to automatically recognize shifts in the data distribution. Extensive experiments across many task-free OCL scenarios and benchmark datasets (including CIFAR-100, ImageNet-R, ImageNet-S, CUB-200, and CORe50) demonstrate that Online-LoRA can be robustly adapted to various ViT architectures while achieving better performance than SOTA methods. Our code will be publicly available at: https://github.com/Christina200/Online-LoRA-official.git.

Paper Structure

This paper contains 38 sections, 9 equations, 6 figures, 16 tables.

Figures (6)

  • Figure 1: Overview of Online-LoRA. As the data is continuously streamed (a), a new pair of trainable LoRA parameters ($A_4, B_4$) is added (b) every time the loss surface encounters a plateau (c). Subsequently, the previous LoRA parameters ($A_1, B_1; A_2, B_2; A_3, B_3$) are frozen (the lock sign in (b)) and merged into the weights of the pre-trained ViT model.
  • Figure 2: Average accuracy versus number of samples for the Si-Blurry CIFAR-100, ImageNet-R, and ImageNet-S scenarios. Online-LoRA consistently outperforms competing methods and maintains high accuracy throughout.
  • Figure 3: Test accuracy of three tasks versus the number of learning tasks. The ViT-B/16 model is used on Split ImageNet-S with 20 tasks. The accuracy for each task is recorded as zero until the model has been trained on it, since no measurements are taken before the model has been exposed to that task.
  • Figure 4: Loss surface of Online-LoRA on Split CIFAR-100 using the ViT-B/16 model. Note that other peaks and plateaus exist but are not marked.
  • Figure 5: Task accuracy versus the number of learning tasks for task #2 through task #9. Online-LoRA consistently outperforms all other methods in maintaining accuracy on previously learned tasks. Note that the recorded accuracy for a task before it is learned is zero, not due to poor model performance, but because the evaluation focuses on forgetting in tasks the model has already encountered.
  • ...and 1 more figure
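
The mechanism in the Figure 1 caption (add a new LoRA pair when the loss reaches a plateau, then freeze the previous pair and merge it into the pre-trained weights) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a plateau is declared when both the mean and the variance of a sliding window of recent losses fall below thresholds, and that merging follows the standard LoRA update $W \leftarrow W + BA$. The names `PlateauDetector`, `merge_lora`, and `matmul`, along with the window size and tolerance values, are hypothetical and chosen only for illustration.

```python
from collections import deque
from statistics import fmean, pvariance


def matmul(X, Y):
    """Multiply two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


class PlateauDetector:
    """Hypothetical detector: a plateau means recent losses are low and flat."""

    def __init__(self, window=5, mean_tol=0.5, var_tol=1e-3):
        # window, mean_tol and var_tol are illustrative values, not from the paper.
        self.losses = deque(maxlen=window)
        self.mean_tol = mean_tol
        self.var_tol = var_tol

    def update(self, loss):
        """Record one loss value; return True if the window now forms a plateau."""
        self.losses.append(loss)
        if len(self.losses) < self.losses.maxlen:
            return False  # not enough history yet
        plateau = (fmean(self.losses) < self.mean_tol
                   and pvariance(self.losses) < self.var_tol)
        if plateau:
            self.losses.clear()  # start a fresh window for the next LoRA pair
        return plateau


def merge_lora(W, B, A):
    """Fold a frozen LoRA pair into the base weight: returns W + B @ A."""
    delta = matmul(B, A)
    return [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]
```

In a training loop, each call to `update` that returns `True` would be the point at which the current pair is merged with `merge_lora`, frozen, and replaced by a freshly initialized pair.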