DynaMMo: Dynamic Model Merging for Efficient Class Incremental Learning for Medical Images

Mohammad Areeb Qazi, Ibrahim Almakky, Anees Ur Rehman Hashmi, Santosh Sanjeev, Mohammad Yaqub

TL;DR

DynaMMo addresses catastrophic forgetting in class-incremental medical image learning by merging the task-specific adapters learned during Adapter Tuning into a unified model, which reduces computational overhead. The two-stage approach first learns a task-specific adapter for each CNN block, then averages the adapters across tasks into a single merged set and fine-tunes the classification head on a balanced set that includes replay data. Experiments on CIFAR100, PATH16, and SKIN8 show a roughly 10× reduction in GFLOPS with a drop of only 2.76 in average accuracy relative to state-of-the-art dynamic expansion methods, while outperforming several baselines and remaining efficient as the number of tasks grows. The result points to practical, resource-efficient continual learning for medical imaging, where data privacy and limited compute are critical considerations.

Abstract

Continual learning, the ability to acquire knowledge from new data while retaining previously learned information, is a fundamental challenge in machine learning. Various approaches, including memory replay, knowledge distillation, model regularization, and dynamic network expansion, have been proposed to address this issue. Thus far, dynamic network expansion methods have achieved state-of-the-art performance, but at the cost of significant computational overhead. This overhead comes from the need for additional model buffers, which makes these methods less feasible in resource-constrained settings, particularly in the medical domain. To overcome this challenge, we propose Dynamic Model Merging (DynaMMo), a method that merges multiple networks at different stages of model training to achieve better computational efficiency. Specifically, we employ lightweight learnable modules for each task and combine them into a unified model to minimize computational overhead. DynaMMo achieves this without compromising performance, offering a cost-effective solution for continual learning in medical applications. We evaluate DynaMMo on three publicly available datasets, demonstrating its effectiveness compared to existing approaches. DynaMMo offers an approximately 10-fold reduction in GFLOPS with a small drop of 2.76 in average accuracy compared to state-of-the-art dynamic-based approaches. The code will be made available upon acceptance at https://github.com/BioMedIA-MBZUAI/DynaMMo.
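
To make the merging step concrete, the following is a minimal sketch of combining per-task adapters by element-wise averaging, as the TL;DR describes. It is not the authors' implementation: the uniform (unweighted) average, the `merge_adapters` helper, the dictionary layout, and the parameter names are all assumptions made for illustration, and the paper may weight or schedule its merges differently.

```python
from typing import Dict, List

# Hypothetical layout: parameter name -> flattened list of weights.
Adapter = Dict[str, List[float]]


def merge_adapters(adapters: List[Adapter]) -> Adapter:
    """Return the element-wise mean of same-named parameters across adapters."""
    if not adapters:
        raise ValueError("need at least one adapter to merge")
    merged: Adapter = {}
    for name in adapters[0]:
        # Collect this parameter from every task adapter.
        columns = [adapter[name] for adapter in adapters]
        if any(len(col) != len(columns[0]) for col in columns):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Average position by position (uniform weighting is an assumption).
        merged[name] = [sum(vals) / len(vals) for vals in zip(*columns)]
    return merged


# Two toy task adapters for a single CNN block; the values are made up.
task1 = {"block1.weight": [1.0, 2.0], "block1.bias": [0.0]}
task2 = {"block1.weight": [3.0, 4.0], "block1.bias": [1.0]}
merged = merge_adapters([task1, task2])
print(merged)  # {'block1.weight': [2.0, 3.0], 'block1.bias': [0.5]}
```

Because the merged set has the same shape as a single adapter, inference cost in this sketch does not grow with the number of tasks, which is the intuition behind the reported GFLOPS savings.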

Paper Structure

This paper contains 13 sections, 2 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Our proposed method, DynaMMo, for continual learning. Task-specific adapters are first learned to capture task-specific features (left). These adapters are then merged before the classification head is fine-tuned on a balanced set that includes replay data (right).
  • Figure 2: Continual learning performance comparison between DynaMMo (ours), ICARL [rebuffi2017icarl], UCIR [hou2018lifelong], and PODNET [douillard2020podnet], along with fixed Replay and standard incremental fine-tuning, on the (a) SKIN8 [tschandl2018ham10000], (b) PATH16 [zhang2023adapter], and (c) CIFAR100 [krizhevsky2009learning] datasets.
  • Figure 3: Confusion matrices for DynaMMo on the (a) SKIN8 [tschandl2018ham10000], (b) PATH16 [zhang2023adapter], and (c) CIFAR100 [krizhevsky2009learning] datasets.