Merge and Bound: Direct Manipulations on Weights for Class Incremental Learning
Taehoon Kim, Donghwan Jang, Bohyung Han
TL;DR
This paper tackles catastrophic forgetting in Class Incremental Learning by introducing Merge-and-Bound (M&B), a plug‑in training strategy that directly manipulates network weights. It uses inter‑task weight merging to form a base model across all previous stages and intra‑task weight merging to refine the current task trajectory, complemented by a bounded update that restricts weight changes to stay near the base model. The approach integrates with existing CIL methods and yields consistent performance gains on CIFAR‑100 and ImageNet‑100/1000 benchmarks, with pronounced improvements as the number of tasks grows and under limited memory budgets. Overall, M&B enhances stability and plasticity in continual learning by promoting reliable weight merging and preserving prior representations, offering a practical, low‑cost enhancement for real‑world continual learning deployments.
Abstract
We present a novel training approach, named Merge-and-Bound (M&B), for Class Incremental Learning (CIL), which directly manipulates model weights in the parameter space for optimization. Our algorithm involves two types of weight merging: inter-task weight merging and intra-task weight merging. Inter-task weight merging unifies previous models by averaging the weights of models from all previous stages. Intra-task weight merging, on the other hand, facilitates learning of the current task by combining model parameters within the current stage. For reliable weight merging, we also propose a bounded update technique that aims to optimize the target model with minimal cumulative updates and preserve knowledge from previous tasks; this strategy shows that new models can be obtained effectively near old ones, reducing catastrophic forgetting. M&B integrates seamlessly into existing CIL methods without modifying architecture components or revising learning objectives. We extensively evaluate our algorithm on standard CIL benchmarks and demonstrate superior performance compared to state-of-the-art methods.
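To make the three operations in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It rests on several assumptions that the abstract does not specify: both kinds of merging are modeled as uniform running averages, the bounded update is modeled as a projection that keeps the L2 distance to the base model within a radius, and the names `running_average`, `bound_update`, and `radius` are hypothetical. Weights are represented as a toy mapping from parameter name to a flat list of floats.

```python
import math
from typing import Dict, List

# Hypothetical toy representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def running_average(avg: Weights, new: Weights, count: int) -> Weights:
    """Fold `new` into `avg`, where `avg` is the uniform mean of `count` models.

    Assumption: uniform averaging. The abstract only says weights are
    averaged (inter-task) or combined (intra-task).
    """
    return {
        name: [(a * count + n) / (count + 1) for a, n in zip(avg[name], new[name])]
        for name in avg
    }


def bound_update(base: Weights, current: Weights, radius: float) -> Weights:
    """Shrink the displacement (current - base) so its L2 norm is at most `radius`.

    Assumption: an L2-ball projection is one possible way to keep the new
    model near the base model; the paper's exact rule may differ.
    """
    squared = sum(
        (c - b) ** 2
        for name in base
        for b, c in zip(base[name], current[name])
    )
    norm = math.sqrt(squared)
    scale = 1.0 if norm <= radius else radius / norm
    return {
        name: [b + scale * (c - b) for b, c in zip(base[name], current[name])]
        for name in base
    }


# Toy data: `base` stands in for the merged model of earlier stages, and
# `checkpoints` stands in for weights saved while training the current task.
base = {"w": [0.0, 0.0]}
checkpoints = [{"w": [3.0, 4.0]}, {"w": [6.0, 8.0]}]

# Intra-task merging: average the checkpoints of the current stage.
intra = checkpoints[0]
for seen, checkpoint in enumerate(checkpoints[1:], start=1):
    intra = running_average(intra, checkpoint, seen)

# Bounded update: keep the merged weights close to the base model.
bounded = bound_update(base, intra, radius=5.0)

# Inter-task merging: fold the current stage's model into the base model.
new_base = running_average(base, bounded, count=1)
```

In this toy run the averaged checkpoint lies farther than `radius` from `base`, so `bound_update` pulls it back onto the boundary before it is folded into the new base model. In a real training loop the same functions would operate on a framework's parameter tensors rather than Python lists.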
