DynaMMo: Dynamic Model Merging for Efficient Class Incremental Learning for Medical Images
Mohammad Areeb Qazi, Ibrahim Almakky, Anees Ur Rehman Hashmi, Santosh Sanjeev, Mohammad Yaqub
TL;DR
DynaMMo addresses catastrophic forgetting in class-incremental learning for medical images by merging the task-specific adapters learned during adapter tuning into a single unified model, which reduces computational overhead. The approach has two stages: it first learns task-specific representations for each CNN block, then averages the adapters across tasks to form a merged set that is used with a balanced fine-tuned classifier. Experiments on CIFAR100, PATH16, and SKIN8 show a roughly 10× reduction in GFLOPS with a small drop of 2.76 in average accuracy relative to state-of-the-art dynamic-based approaches, while outperforming several baselines and remaining efficient as the number of tasks grows. The result is a practical, resource-efficient approach to continual learning in medical imaging, where data privacy and limited compute are critical considerations.
Abstract
Continual learning, the ability to acquire knowledge from new data while retaining previously learned information, is a fundamental challenge in machine learning. Various approaches, including memory replay, knowledge distillation, model regularization, and dynamic network expansion, have been proposed to address this issue. Thus far, dynamic network expansion methods have achieved state-of-the-art performance, but at the cost of significant computational overhead. This overhead stems from the need for additional model buffers, which makes these methods less feasible in resource-constrained settings, particularly in the medical domain. To overcome this challenge, we propose Dynamic Model Merging (DynaMMo), a method that merges multiple networks at different stages of model training to achieve better computational efficiency. Specifically, we employ lightweight learnable modules for each task and combine them into a unified model to minimize computational overhead. DynaMMo achieves this without compromising performance, offering a cost-effective solution for continual learning in medical applications. We evaluate DynaMMo on three publicly available datasets, demonstrating its effectiveness compared to existing approaches. DynaMMo offers a roughly 10-fold reduction in GFLOPS with a small drop of 2.76 in average accuracy when compared to state-of-the-art dynamic-based approaches. The code will be made available upon acceptance at https://github.com/BioMedIA-MBZUAI/DynaMMo.
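The merging step, which combines per-task lightweight modules into one unified set by averaging, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Adapter` representation (parameter names mapped to flat lists of weights), the `merge_adapters` helper, and the parameter names are all hypothetical, and a plain element-wise mean is assumed.

```python
from typing import Dict, List

# Hypothetical representation: each task's adapter maps a parameter name
# to a flat list of weights. Real adapters would be tensors.
Adapter = Dict[str, List[float]]


def merge_adapters(adapters: List[Adapter]) -> Adapter:
    """Element-wise mean of same-shaped adapter parameters across tasks."""
    if not adapters:
        raise ValueError("need at least one adapter to merge")
    merged: Adapter = {}
    for name, first in adapters[0].items():
        # Collect this parameter from every task's adapter.
        columns = [a[name] for a in adapters]
        if any(len(c) != len(first) for c in columns):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Average position by position across tasks.
        merged[name] = [sum(vals) / len(adapters) for vals in zip(*columns)]
    return merged


# Two toy adapters for one CNN block (values are made up for illustration).
task1 = {"block1.weight": [1.0, 2.0], "block1.bias": [0.0]}
task2 = {"block1.weight": [3.0, 4.0], "block1.bias": [1.0]}
merged = merge_adapters([task1, task2])
```

Under these assumptions, the merged set has the same size as a single task's adapter, so inference cost would not grow with the number of tasks, which is the kind of saving the GFLOPS comparison refers to.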
