Dynamic Gradient Sparsification Training for Few-Shot Fine-tuning of CT Lymph Node Segmentation Foundation Model

Zihao Luo, Zijun Gao, Wenjun Liao, Shichuan Zhang, Guotai Wang, Xiangde Luo

TL;DR

This work tackles lymph node (LN) segmentation under data scarcity by building an LN segmentation foundation model trained on a large head-and-neck CT dataset, then enabling effective few-shot adaptation via Dynamic Gradient Sparsification Training (DGST). At each iteration $N$, DGST computes the gradients $G^N=\{ g_{\theta_i}^N=\nabla_{\theta_i}\mathcal{L}(\mathcal{D}_d)\}$ on the downstream data, selects within each kernel $C_k$ the top-$\gamma$ parameters ranked by gradient magnitude, $\mathcal{P}^N_S=\bigcup_k\text{Top}^{(\gamma)}_{\theta_i\in C_k}|g_{\theta_i}^N|$, and updates only the selected parameters $\theta_i\in\mathcal{P}^N_S$ with the rule $\theta_i \leftarrow \theta_i - \eta g_{\theta_i}^N$; all other parameters keep their current values for that iteration. The foundation model is pre-trained by minimizing $\mathcal{L}_{CE+Dice}$ over $\mathcal{D}_F$, giving $M_F=\arg\min_{\theta_F} \mathcal{L}_{CE+Dice}(\mathcal{D}_F:\theta_F)$, and is fine-tuned on downstream data via $M_d=\arg\min_{\theta_d} \mathcal{L}_{CE+Dice}(\mathcal{D}_d:\theta_d) + R_{penalty}(\theta_F,\theta_d)$. Evaluations on SegRap2023 and LNQ2023 show that DGST outperforms existing few-shot fine-tuning methods, approaching full-data performance on SegRap2023 and delivering robust gains under cross-region variation on LNQ2023. The LN annotations, models, and implementations are publicly released to support broader adoption in clinical workflows.
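The selection-and-update rule above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. It assumes that $\gamma$ is a fraction in $(0, 1]$, that each kernel $C_k$ is given as a flat list of weights, and that the number of selected weights per kernel is $\lceil \gamma \cdot |C_k| \rceil$. The names `topk_mask` and `dgst_step` are hypothetical, and the penalty term $R_{penalty}$ is left out.

```python
import math
from typing import List

Kernel = List[float]  # one flattened kernel C_k (assumed layout)


def topk_mask(grads: Kernel, gamma: float) -> List[bool]:
    """Mark the ceil(gamma * len(grads)) entries with the largest |gradient|."""
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must be in (0, 1]")
    k = max(1, math.ceil(gamma * len(grads)))
    # Indices sorted by descending gradient magnitude; ties keep index order.
    order = sorted(range(len(grads)), key=lambda i: -abs(grads[i]))
    selected = set(order[:k])
    return [i in selected for i in range(len(grads))]


def dgst_step(params: List[Kernel], grads: List[Kernel],
              gamma: float, lr: float) -> List[Kernel]:
    """One sparsified SGD step: theta_i <- theta_i - lr * g_i for selected theta_i only."""
    new_params = []
    for theta_k, g_k in zip(params, grads):
        mask = topk_mask(g_k, gamma)  # recomputed from the current gradient
        new_params.append(
            [t - lr * g if m else t for t, g, m in zip(theta_k, g_k, mask)]
        )
    return new_params


# Toy example: two kernels of four weights each, gamma = 0.5 (two weights per kernel).
params = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]
grads = [[0.1, -0.9, 0.5, 0.0], [2.0, 0.0, -3.0, 1.0]]
updated = dgst_step(params, grads, gamma=0.5, lr=0.5)
```

In the toy example only the two weights with the largest gradient magnitude in each kernel change, and the rest keep their pre-trained values. Because the mask is rebuilt from the gradient at every call, the set of updated weights can differ from one iteration to the next, which is the "dynamic" part of the method as described above.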

Abstract

Accurate lymph node (LN) segmentation is critical in radiotherapy treatment and prognosis analysis, but is limited by the need for large annotated datasets. While deep learning-based segmentation foundation models show potential in developing high-performing models with fewer samples, their medical adaptation faces LN domain-specific prior deficiencies and inefficient few-shot fine-tuning for complex clinical practices, highlighting the necessity of an LN segmentation foundation model. In this work, we annotated 36,106 visible LNs from 3,346 publicly available head-and-neck CT scans to establish a robust LN segmentation model (nnUNetv2). Building on this, we propose Dynamic Gradient Sparsification Training (DGST), a few-shot fine-tuning approach that preserves foundational knowledge while dynamically updating the most critical parameters of the LN segmentation model with few annotations. We validate it on two publicly available LN segmentation datasets: SegRap2023 and LNQ2023. The results show that DGST outperforms existing few-shot fine-tuning methods, achieving satisfactory performance with limited labeled data. We release the dataset, models and all implementations to facilitate relevant research: https://github.com/Zihaoluoh/LN-Seg-FM.

Paper Structure

This paper contains 11 sections, 5 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: (a) Pre-training of the foundation model using 3k+ HN CT volumes and 36k+ visible lymph node annotations. (b) Few-shot fine-tuning to downstream tasks by transferring the pre-trained model to new datasets via Dynamic Gradient Sparsification Training (DGST). (c) DGST methodology: At each iteration, the parameters $\mathcal{P}_O$ are sparsified to $\mathcal{P}^N_S$ using the current gradient $G^N$ for each kernel, and then optimized.
  • Figure 2: Qualitative comparison of different fine-tuning methods. The ground truth and predictions are shown in green and yellow contours, respectively.
  • Figure 3: Sensitivity analysis of hyperparameter $\gamma$.