Versatile Incremental Learning: Towards Class and Domain-Agnostic Incremental Learning

Min-Yeong Park, Jae-Ho Lee, Gyeong-Moon Park

TL;DR

This work proposes a simple yet effective IL framework, named Incremental Classifier with Adaptation Shift cONtrol (ICON), and designs a novel regularization method called Cluster-based Adaptation Shift conTrol (CAST), which steers the model away from confusion with previously learned knowledge so that it accumulates new knowledge more effectively.

Abstract

Incremental Learning (IL) aims to accumulate knowledge from sequential input tasks while overcoming catastrophic forgetting. Existing IL methods typically assume that an incoming task has only increments of classes or domains, referred to as Class IL (CIL) or Domain IL (DIL), respectively. In this work, we consider a more challenging and realistic but under-explored IL scenario, named Versatile Incremental Learning (VIL), in which a model has no prior knowledge of whether classes or domains will increase in the next task. In the proposed VIL scenario, the model faces intra-class domain confusion and inter-domain class confusion, which prevent it from accumulating new knowledge without interfering with learned knowledge. To address these issues, we propose a simple yet effective IL framework, named Incremental Classifier with Adaptation Shift cONtrol (ICON). Based on shifts of learnable modules, we design a novel regularization method called Cluster-based Adaptation Shift conTrol (CAST), which steers the model away from confusion with previously learned knowledge so that it accumulates new knowledge more effectively. Moreover, we introduce an Incremental Classifier (IC), which expands its output nodes to address the overwriting issue that arises when different domains correspond to a single class, while maintaining the previous knowledge. We conducted extensive experiments on three benchmarks, showcasing the effectiveness of our method across all the scenarios, particularly in cases where the next task can be randomly altered. Our implementation code is available at https://github.com/KHU-AGI/VIL.
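
The shift-based regularization can be illustrated with a small sketch. It follows only what the Figure 3 and Figure 4 captions state: a shift is the weights after an update minus the weights before it, and the current shift is pushed to be orthogonal to pooled shifts from other clusters. It is not the authors' implementation. The function names, the squared-cosine form of the penalty, and the precomputed cluster labels are assumptions made for illustration; how clusters are actually formed is not covered here.

```python
import math


def compute_shift(current_weights, previous_weights):
    """Shift of a learnable module: current weights minus previous weights."""
    return [c - p for c, p in zip(current_weights, previous_weights)]


def cosine(u, v):
    """Cosine similarity of two flat vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def orthogonality_penalty(current_shift, shift_pool, pool_clusters, current_cluster):
    """Hypothetical penalty: sum of squared cosines between the current shift
    and every pooled shift that belongs to a different cluster.

    The penalty is zero exactly when the current shift is orthogonal to all
    of those shifts. Shifts in the same cluster are ignored.
    """
    penalty = 0.0
    for shift, cluster in zip(shift_pool, pool_clusters):
        if cluster != current_cluster:
            penalty += cosine(current_shift, shift) ** 2
    return penalty


# Toy example with 2-D "weights" (assumed values, for illustration only).
previous = [0.0, 0.0]
current = [1.0, 0.0]
shift_pool = [[0.0, 2.0], [3.0, 0.0], [1.0, 1.0]]
pool_clusters = [1, 1, 0]  # assumed to come from some clustering step

v = compute_shift(current, previous)
penalty = orthogonality_penalty(v, shift_pool, pool_clusters, current_cluster=0)
print(v, penalty)
```

In this toy case the first pooled shift is already orthogonal to the current shift and contributes nothing, the second is parallel and contributes the maximum of 1, and the third is skipped because it shares the current cluster.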

Paper Structure (12 sections, 9 equations, 6 figures, 6 tables)

Figures (6)

  • Figure 1: Illustration of several IL scenarios, including our proposed new scenario, Versatile Incremental Learning. Each color shade indicates a domain group, and each solid box indicates a class group. The incremental step follows the red arrow.
  • Figure 2: Comparison of average accuracies among existing CIL and DIL methods on iDigits, CORe50, and DomainNet. In this figure, we compare the baselines that show the best performance on each benchmark, e.g., CODA-Prompt [smith2023coda] and S-Prompts [wang2022sprompts] on iDigits and CORe50, and LAE [gao2023unified] and S-Prompts [wang2022sprompts] on DomainNet. Our proposed ICON outperforms the previous state-of-the-art methods in all scenarios, including the challenging VIL setting.
  • Figure 3: Comparison of adapter shifts in DomainNet when the type of IL remains the same or changes. A shift is measured by subtracting the weights before learning a task from the weights after learning it.
  • Figure 4: Architecture overview. During training, the model computes the current shift ${V_t^i}$ of the learnable modules by subtracting their previous values from their current values. The cluster ${S_t^i}$ to which ${V_t^i}$ belongs is then determined, and shifts in the shift pool that belong to other clusters ${S_{t}^{i'}}$ are considered to come from disparate previous tasks. To guide the current learning in a direction that does not conflict with ${V_{j}}$, ${V_t^i}$ is regularized to be orthogonal to every ${V_{j}}$ in ${S_{t}^{i'}}$. After a task is learned, $V_t$ is saved as a shift in the shift pool, which is used for subsequent clustering.
  • Figure 5: Illustration of the Incremental Classifier (IC) during training. The model adds output nodes to the classifier, if needed, whenever classes in the current task $q$ have already been learned in an earlier task, i.e., task $p$. Nodes for the remaining classes included in task $q$ are trained to preserve the knowledge via distillation. The original nodes of classes whose output nodes have been increased at task $q$ are kept intact by omitting them from the cross-entropy loss.
  • ...and 1 more figures
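
The node bookkeeping described in the Figure 5 caption can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code. The class name, the rule of always adding a node when a class reappears (the caption says only "if needed"), and the max-over-nodes prediction rule are all hypothetical. The distillation term for the remaining nodes is omitted.

```python
class IncrementalClassifierSketch:
    """Hypothetical bookkeeping for a classifier whose output nodes grow per task."""

    def __init__(self):
        # node_to_class[i] is the class label served by output node i.
        self.node_to_class = []

    def expand_for_task(self, task_classes):
        """Add one output node per class in the incoming task.

        Returns the indices of older nodes whose class reappears in this task.
        Following the Figure 5 caption, those nodes would be left out of the
        cross-entropy loss so that they stay intact.
        """
        excluded_from_ce = [
            node for node, label in enumerate(self.node_to_class)
            if label in task_classes
        ]
        for label in task_classes:
            self.node_to_class.append(label)
        return excluded_from_ce

    def predict(self, logits):
        """Assumed inference rule: score each class by its best node."""
        best_per_class = {}
        for node, score in enumerate(logits):
            label = self.node_to_class[node]
            if label not in best_per_class or score > best_per_class[label]:
                best_per_class[label] = score
        return max(best_per_class, key=best_per_class.get)


clf = IncrementalClassifierSketch()
excluded_first = clf.expand_for_task(["cat", "dog"])    # task p: two new classes
excluded_second = clf.expand_for_task(["dog", "bird"])  # task q: "dog" reappears
prediction = clf.predict([0.1, 0.2, 0.9, 0.3])          # one logit per output node
print(clf.node_to_class, excluded_second, prediction)
```

After the second task, "dog" is served by two nodes. The older one is reported as excluded from the cross-entropy loss, so training on the new domain would not overwrite it.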