Scale-Aware Curriculum Learning for Data-Efficient Lung Nodule Detection with YOLOv11

Yi Luo, Yike Guo, Hamed Hooshangnejad, Kai Ding

TL;DR

This work tackles data scarcity in 3D lung nodule detection by introducing Scale-Adaptive Curriculum Learning (SACL), which adjusts the curriculum to the amount of available training data through adaptive epoch scheduling, hard sample injection, and scale-aware optimization. Evaluated on the LUNA25 dataset with YOLOv11, SACL matches static curriculum learning on the full dataset in mAP$_{50}$ and improves over the baseline by 4.6%, 3.5%, and 2.0% when only 10%, 20%, and 50% of the training data are used, respectively. The core ideas are formalized as adaptive rules, among them $E' = \max\{\rho^\beta E, \gamma E, E_{min}\}$ for the epoch budget, $r_{hard}^{min} = r_0 + (1-\rho)\Delta r$ for the minimum hard-sample ratio, and $\eta' = \eta\left[1 - 0.3(1-\rho)\frac{s}{S}\right]$ for the learning rate. Overall, SACL offers a practical way for healthcare institutions to deploy effective lung nodule detection systems when annotation resources are limited, with potential applicability to other medical-imaging tasks.
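To make the three rules quoted above concrete, the following is a minimal Python sketch that evaluates them for a given data scale. It is an illustration under stated assumptions, not the authors' implementation: the summary does not define the symbols, so $\rho$ is read here as the fraction of the full training set in use, $s$ as the current curriculum stage, and $S$ as the number of stages. The class `SACLConfig`, the function names, every default value, and the rounding of $E'$ up to a whole epoch are hypothetical choices made for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SACLConfig:
    """Hypothetical hyperparameters; none of these defaults come from the paper."""
    base_epochs: int = 100   # E: epoch budget on the full dataset
    beta: float = 0.5        # beta: exponent applied to the data fraction
    gamma: float = 0.25      # gamma: floor as a fraction of E
    min_epochs: int = 10     # E_min: absolute floor on epochs
    r0: float = 0.1          # r_0: base minimum hard-sample ratio
    delta_r: float = 0.2     # Delta r: extra hard-sample ratio at low data scale
    base_lr: float = 0.01    # eta: base learning rate


def _check_rho(rho: float) -> None:
    # Assumption: rho is the fraction of the full training set, in (0, 1].
    if not 0.0 < rho <= 1.0:
        raise ValueError("rho must lie in (0, 1]")


def adaptive_epochs(rho: float, cfg: SACLConfig = SACLConfig()) -> int:
    """E' = max(rho^beta * E, gamma * E, E_min), rounded up (rounding is our choice)."""
    _check_rho(rho)
    scaled = (rho ** cfg.beta) * cfg.base_epochs
    floor = cfg.gamma * cfg.base_epochs
    return math.ceil(max(scaled, floor, cfg.min_epochs))


def min_hard_ratio(rho: float, cfg: SACLConfig = SACLConfig()) -> float:
    """r_hard_min = r_0 + (1 - rho) * Delta r: less data means more hard samples."""
    _check_rho(rho)
    return cfg.r0 + (1.0 - rho) * cfg.delta_r


def scaled_lr(rho: float, stage: int, num_stages: int,
              cfg: SACLConfig = SACLConfig()) -> float:
    """eta' = eta * [1 - 0.3 * (1 - rho) * s / S]."""
    _check_rho(rho)
    if num_stages <= 0 or not 0 <= stage <= num_stages:
        raise ValueError("stage must satisfy 0 <= stage <= num_stages, num_stages > 0")
    return cfg.base_lr * (1.0 - 0.3 * (1.0 - rho) * stage / num_stages)


if __name__ == "__main__":
    # Evaluate the rules at the data scales reported in the summary.
    for rho in (0.1, 0.2, 0.5, 1.0):
        print(rho, adaptive_epochs(rho), min_hard_ratio(rho), scaled_lr(rho, 3, 3))
```

With $\rho = 1$ all three rules reduce to the unmodified settings ($E' = E$ whenever $\gamma \le 1$ and $E_{min} \le E$, $r_{hard}^{min} = r_0$, $\eta' = \eta$). This is consistent with the statement that SACL matches static curriculum learning on the full dataset, although the summary does not say that this is the reason.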

Abstract

Lung nodule detection in chest CT is crucial for early lung cancer diagnosis, yet existing deep learning approaches face challenges when deployed in clinical settings with limited annotated data. While curriculum learning has shown promise in improving model training, traditional static curriculum strategies fail in data-scarce scenarios. We propose Scale-Adaptive Curriculum Learning (SACL), a novel training strategy that dynamically adjusts curriculum design based on the available data scale. SACL introduces three key mechanisms: (1) adaptive epoch scheduling, (2) hard sample injection, and (3) scale-aware optimization. We evaluate SACL on the LUNA25 dataset using YOLOv11 as the base detector. Experimental results demonstrate that, while SACL achieves mAP$_{50}$ comparable to static curriculum learning on the full dataset, it shows significant advantages under data-limited conditions, with 4.6%, 3.5%, and 2.0% improvements over the baseline at 10%, 20%, and 50% of the training data, respectively. By enabling robust training across varying data scales without architectural modifications, SACL provides a practical solution for healthcare institutions to develop effective lung nodule detection systems despite limited annotation resources.

Paper Structure

This paper contains 11 sections, 5 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: Comparison of CL and SACL training strategies. CL uses fixed stage setups regardless of dataset size, while SACL dynamically adjusts epoch counts, hard sample ratios, and optimization parameters based on available data volume.
  • Figure 2: Detection results comparison using 10% of training data. The first column shows the complexity score/rate for each case. The subsequent columns show results from different models: baseline, CL, SACL, and ground truth. Each row represents the same case. Green boxes indicate model predictions with confidence scores, red boxes show ground truth annotations. Absence of red boxes indicates negative samples, while absence of green boxes indicates the model predicted negative.