Multi-Label Continual Learning for the Medical Domain: A Novel Benchmark
Marina Ceccon, Davide Dalle Pezze, Alessandro Fabris, Gian Antonio Susto
TL;DR
The paper tackles catastrophic forgetting in medical imaging by introducing a New Instances and New Classes (NIC) benchmark that combines new class arrivals and domain shifts across 19 diseases and 7 tasks drawn from two datasets. It proposes Replay Consolidation with Label Propagation (RCLP), which combines forward/backward label propagation, a masking loss, and feature distillation to maximize replay-memory utility while minimizing interference. Empirical results show RCLP outperforms standard replay, distillation, and hybrid baselines, achieving a mean F1 of about 0.27, a mean AUC of about 0.692, and forgetting of around 2.4%, indicating robust multi-label continual learning in the medical domain. The work provides a practical benchmark and a scalable method with potential to extend to other tasks such as object detection and semantic segmentation in healthcare.
Abstract
Despite the critical importance of the medical domain in Deep Learning, most research in this area focuses solely on training models in static environments. Only in recent years has research begun to address dynamic environments and tackle the Catastrophic Forgetting problem through Continual Learning (CL) techniques. Previous studies have primarily focused on scenarios such as Domain Incremental Learning (DIL) and Class Incremental Learning (CIL), which do not fully capture the complexity of real-world applications. Therefore, in this work, we propose a novel benchmark combining the challenges of new class arrivals and domain shifts in a single framework, by considering the New Instances and New Classes (NIC) scenario. This benchmark aims to model a realistic CL setting for the multi-label classification problem in medical imaging. Specifically, it consists of two datasets (NIH and CXP), nineteen classes, and seven tasks, a longer stream than those previously tested. To address common challenges found in the CIL and NIC scenarios (e.g., the task inference problem), we propose a novel approach called Replay Consolidation with Label Propagation (RCLP). Our method surpasses existing approaches, exhibiting superior performance with minimal forgetting.
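To make the distinction between the scenarios concrete, the following is a minimal sketch, not the benchmark's actual construction. It assumes a stream can be described as a list of tasks, each with a source domain and a set of annotated classes, and labels the stream as Domain Incremental (only domain shifts), Class Incremental (only new classes), or New Instances and New Classes (both). The `Task` type, the `scenario` function, and the placeholder labels `c0` to `c3` are hypothetical and do not reflect the paper's real class split.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    domain: str          # source dataset, e.g. "NIH" or "CXP"
    classes: frozenset   # labels annotated in this task (placeholders here)


def scenario(stream):
    """Label a task stream as 'static', 'DIL', 'CIL', or 'NIC'."""
    if not stream:
        return "static"
    seen_classes = set(stream[0].classes)
    seen_domains = {stream[0].domain}
    new_classes = False
    domain_shift = False
    for task in stream[1:]:
        # A task introduces new classes if any label was never seen before.
        if task.classes - seen_classes:
            new_classes = True
        # A task introduces a domain shift if its source was never seen before.
        if task.domain not in seen_domains:
            domain_shift = True
        seen_classes |= task.classes
        seen_domains.add(task.domain)
    if new_classes and domain_shift:
        return "NIC"
    if new_classes:
        return "CIL"
    if domain_shift:
        return "DIL"
    return "static"


# Hypothetical toy stream: placeholder labels, NOT the benchmark's split.
toy_stream = [
    Task("NIH", frozenset({"c0", "c1"})),
    Task("CXP", frozenset({"c0", "c2"})),  # new domain and a new class
    Task("NIH", frozenset({"c3"})),        # a further new class
]
print(scenario(toy_stream))
```

In this toy stream the second task brings both a new source dataset and a previously unseen label, so the stream falls under NIC rather than DIL or CIL alone.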
