Is Complete Labeling Necessary? Understanding Active Learning in Longitudinal Medical Imaging
Siteng Ma, Honghui Du, Prateek Mathur, Brendan S. Kelly, Ronan P. Killeen, Aonghus Lawlor, Ruihai Dong
TL;DR
This work tackles the high cost of labeling changes in longitudinal medical imaging by introducing Longitudinal Medical Imaging Active Learning (LMI-AL), a framework that tailors deep active learning (DAL) to pairwise slice differences between baseline and follow-up scans. It converts 3D MRI data into 2D slice pairs with their differences, builds the initial pool from all possible pairs, and iteratively selects the most informative pairs under various query strategies to train a change-detection model. On two MRI datasets of multiple sclerosis (MS) lesions, LMI-AL performs comparably to fully labeled models while labeling less than 8% of the data for MSSEG-2 and less than 5% for SVUH, with diversity-based and hybrid queries often performing best under budget constraints. These results suggest substantial practical savings in annotation effort for longitudinal imaging and offer guidance on effective DAL strategies for change detection in medical data.
Abstract
Detecting changes in longitudinal medical imaging using deep learning requires a substantial amount of accurately labeled data. However, labeling these images is notably more costly and time-consuming than labeling other image types, as it requires annotation across multiple time points, where new lesions can be small and subtle changes are easily missed. Deep Active Learning (DAL) has shown promise in minimizing labeling costs by selectively querying the most informative samples, but existing studies have primarily focused on static tasks such as classification and segmentation. Consequently, conventional DAL approaches cannot be directly applied to change detection tasks, which involve identifying subtle differences across multiple images. In this study, we propose a novel DAL framework, named Longitudinal Medical Imaging Active Learning (LMI-AL), tailored specifically for longitudinal medical imaging. By pairing and differencing all 2D slices from baseline and follow-up 3D images, LMI-AL iteratively selects the most informative pairs for labeling using DAL, training a deep learning model with minimal manual annotation. Experimental results demonstrate that, with less than 8% of the data labeled, LMI-AL can achieve performance comparable to models trained on fully labeled datasets. We also provide a detailed analysis of the method's performance to guide future research. The code is publicly available at https://github.com/HelenMa9998/Longitudinal_AL.
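
The pairing, differencing, and querying steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`make_slice_pairs`, `entropy_scores`, `select_queries`), the same-index slice pairing of co-registered volumes, the three-channel layout, and the hard-coded model probabilities are all assumptions made for illustration. Entropy-based uncertainty sampling is used only because it is the simplest query strategy to show; the results above indicate that diversity-based and hybrid strategies often perform better under tight budgets.

```python
import numpy as np


def make_slice_pairs(baseline, followup):
    """Turn two co-registered 3D volumes of shape (slices, H, W) into
    per-slice samples with three channels: baseline, follow-up, difference.
    Same-index pairing is an assumption of this sketch."""
    if baseline.shape != followup.shape:
        raise ValueError("volumes must have the same shape")
    diff = followup - baseline
    return np.stack([baseline, followup, diff], axis=1)  # (slices, 3, H, W)


def entropy_scores(probs):
    """Binary entropy of predicted change probabilities (higher = more uncertain)."""
    p = np.clip(probs, 1e-7, 1 - 1e-7)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))


def select_queries(probs, unlabeled_idx, batch_size):
    """Return the indices of the batch_size most uncertain unlabeled pairs."""
    scores = entropy_scores(probs[unlabeled_idx])
    order = np.argsort(-scores)[:batch_size]
    return unlabeled_idx[order]


# Toy data: six slices, with a simulated new lesion on slice 2 of the follow-up.
rng = np.random.default_rng(0)
baseline = rng.random((6, 4, 4))
followup = baseline.copy()
followup[2, 1:3, 1:3] += 0.5
pairs = make_slice_pairs(baseline, followup)

# One query round: slice 0 is already labeled, the rest are candidates.
labeled = np.array([0])
unlabeled = np.setdiff1d(np.arange(len(pairs)), labeled)

# Hypothetical model outputs (a trained change-detection model would supply these).
probs = np.full(len(pairs), 0.05)
probs[2] = 0.5

query = select_queries(probs, unlabeled, batch_size=1)
labeled = np.concatenate([labeled, query])  # these pairs go to the annotator next
```

In a full loop, the model would be retrained on the enlarged labeled set after each round and the probabilities recomputed before the next query, until the labeling budget is exhausted.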
