Automatic Quantification of Serial PET/CT Images for Pediatric Hodgkin Lymphoma Patients Using a Longitudinally-Aware Segmentation Network

Xin Tie; Muheon Shin; Changhee Lee; Scott B. Perlman; Zachary Huemann; Amy J. Weisman; Sharon M. Castellino; Kara M. Kelly; Kathleen M. McCarten; Adina L. Alazraki; Junjie Hu; Steve Y. Cho; Tyler J. Bradshaw


TL;DR

This work tackles the challenge of automatically quantifying longitudinal PET/CT changes in pediatric Hodgkin lymphoma by introducing LAS-Net, a dual-branch segmentation network that uses longitudinal cross-attention to inform interim PET analysis with baseline information. The model integrates a longitudinally-aware window attention (LAWA) module and a longitudinally-aware attention gate (LAAG) within a SwinUNETR backbone, improving detection of residual disease on interim scans while preserving baseline segmentation accuracy. Across internal and external cohorts, LAS-Net's measurements correlate strongly with physician-derived metrics (e.g., $MTV$, $TLG$, $qPET$, and $\Delta SUV_{max}$), and its interim lesion detection (F1 = 0.606) outperforms several baselines, with notable gains in Deauville score (DS) agreement. The approach highlights the value of incorporating longitudinal context in deep learning models for multi-time-point imaging and offers a pathway toward more rapid, objective, and scalable response assessment in pediatric lymphoma.
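The one-way flow of information from the baseline branch to the interim branch can be illustrated with a minimal single-head cross-attention sketch in plain Python. This is a toy under stated assumptions, not the LAS-Net implementation: the learned query/key/value projections, multiple heads, and the windowing used by LAWA are all omitted, and the names `cross_attention`, `pet2_tokens`, and `pet1_tokens` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(pet2_tokens, pet1_tokens):
    """Single-head cross-attention with identity projections (a simplification).

    Queries come from the interim (PET2) features; keys and values come from
    the baseline (PET1) features. PET1 features are only read, never updated,
    which mirrors the one-way PET1 -> PET2 information flow.
    """
    dim = len(pet2_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, weights = [], []
    for query in pet2_tokens:
        # Scaled dot-product score between this PET2 token and every PET1 token.
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in pet1_tokens]
        w = softmax(scores)
        weights.append(w)
        # Attention-weighted sum of PET1 tokens (used here as the values).
        outputs.append(
            [sum(wi * value[j] for wi, value in zip(w, pet1_tokens)) for j in range(dim)]
        )
    return outputs, weights


# Hypothetical toy features: 2 PET2 tokens and 3 PET1 tokens, each of dimension 2.
pet2 = [[1.0, 0.0], [0.0, 1.0]]
pet1 = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
attended, attn = cross_attention(pet2, pet1)
```

Each row of `attn` sums to one, so every PET2 token receives a convex combination of PET1 features. In a full model this output would typically be combined with the PET2 features themselves, for example through a residual connection.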

Abstract

$\textbf{Purpose}$: Automatic quantification of longitudinal changes in PET scans for lymphoma patients has proven challenging, as residual disease in interim-therapy scans is often subtle and difficult to detect. Our goal was to develop a longitudinally-aware segmentation network (LAS-Net) that can quantify serial PET/CT images for pediatric Hodgkin lymphoma patients. $\textbf{Materials and Methods}$: This retrospective study included baseline (PET1) and interim (PET2) PET/CT images from 297 patients enrolled in two Children's Oncology Group clinical trials (AHOD1331 and AHOD0831). LAS-Net incorporates longitudinal cross-attention, allowing relevant features from PET1 to inform the analysis of PET2. Model performance was evaluated using Dice coefficients for PET1 and detection F1 scores for PET2. Additionally, we extracted and compared quantitative PET metrics, including metabolic tumor volume (MTV) and total lesion glycolysis (TLG) in PET1, as well as qPET and $Δ$SUVmax in PET2, against physician measurements. We quantified their agreement using Spearman's $ρ$ correlations and employed bootstrap resampling for statistical analysis. $\textbf{Results}$: LAS-Net detected residual lymphoma in PET2 with an F1 score of 0.606 (precision/recall: 0.615/0.600), outperforming all comparator methods (P<0.01). For baseline segmentation, LAS-Net achieved a mean Dice score of 0.772. In PET quantification, LAS-Net's measurements of qPET, $Δ$SUVmax, MTV and TLG were strongly correlated with physician measurements, with Spearman's $ρ$ of 0.78, 0.80, 0.93 and 0.96, respectively. The performance remained high, with a slight decrease, in an external testing cohort. $\textbf{Conclusion}$: LAS-Net demonstrated significant improvements in quantifying PET metrics across serial scans, highlighting the value of longitudinal awareness in evaluating multi-time-point imaging datasets.
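The agreement analysis described above (Spearman's $ρ$ with bootstrap resampling) can be sketched with the standard library alone. This is an illustrative sketch, not the authors' code: the number of resamples, patient-level resampling with replacement, the percentile interval, and the helper names `average_ranks`, `spearman_rho`, and `bootstrap_spearman` are all assumptions, and the toy values are invented for demonstration.

```python
import random
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks.

    Returns None when either variable is constant (rho is undefined).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return None
    return cov / (vx * vy) ** 0.5


def bootstrap_spearman(x, y, n_boot=1000, seed=0):
    """Resample (x, y) pairs with replacement; return mean rho and a 95% percentile interval."""
    rng = random.Random(seed)
    n = len(x)
    rhos = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        rho = spearman_rho([x[i] for i in idx], [y[i] for i in idx])
        if rho is not None:  # skip degenerate resamples
            rhos.append(rho)
    if not rhos:
        raise ValueError("all bootstrap resamples were degenerate")
    rhos.sort()
    lower = rhos[int(0.025 * (len(rhos) - 1))]
    upper = rhos[int(0.975 * (len(rhos) - 1))]
    return mean(rhos), lower, upper


# Hypothetical physician vs. model measurements for six patients.
physician = [1.2, 3.4, 2.2, 5.0, 4.1, 6.3]
model = [1.0, 2.0, 2.5, 5.5, 6.1, 6.0]
rho_mean, rho_lo, rho_hi = bootstrap_spearman(physician, model, n_boot=500, seed=1)
```

The sketch returns the bootstrap mean together with the 2.5th and 97.5th percentiles.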


Paper Structure (17 sections, 3 equations, 11 figures, 6 tables)


Introduction
Materials and Methods
Patient Cohort
Data Labeling
LAS-Net Architecture
Quantitative PET Metrics
Model Comparison
Agreement of Predicted DSs and Physician-assigned DSs
Statistical Analysis
Data Availability
Results
Quantitative Performance
Qualitative evaluation
Agreement of Model-Extract DS and Physician Assigned DS
Ablation Studies
...and 2 more sections

Figures (11)

Figure 1: The architecture of the longitudinally-aware segmentation network (LAS-Net). (A) The dual-branch design accommodates baseline (PET1) and interim (PET2) PET/CT images. One branch is dedicated to processing PET1, while the other branch focuses on PET2, using features extracted from PET2 as well as features from the PET1 branch. (B) The longitudinally-aware window attention (LAWA) module introduces multi-head cross-attention following two self-attention blocks. All attention layers have a window size of 7. (C) The longitudinally-aware attention gate (LAAG) introduces a learnable convolutional layer (kernel size = 7) following the standard self-attention gate to refine the attention coefficients for PET2. Both the LAWA and LAAG modules allow only one-way information flow, from the PET1 branch to the PET2 branch.
Figure 2: Performance comparison of interim PET lesion detection in the internal cohort. Results are reported with and without mask propagation through deformable registration (MPDR). Notably, CKD-Trans and ST-Trans utilized baseline lesion masks predicted by DynUNet for MPDR. (A) and (B) present the detection F1 scores, precision, and recall using different criteria to classify true positives. In (A), a predicted lesion is classified as a true positive if it overlaps with at least one voxel of the reference lesion. In (B), a predicted lesion is considered a true positive if its SUVmax matches the reference lesion’s SUVmax. (C) quantifies the agreement between model predictions and physician measurements for interim PET metrics. In the plots, actual metric values and Spearman's correlation values are marked by circles, with error bars indicating 95$\%$ confidence intervals. LAS-Net showed significantly improved performance ($P<0.05$) over all comparator methods in F1 scores and interim PET metrics. The only exception was qPET, where its performance did not significantly surpass SegResNet with MPDR ($P=0.057$). SUVmax = maximum lesion standardized uptake value, $\Delta$SUVmax = percentage difference of SUVmax between the baseline and interim scans.
Figure 3: Performance comparison of baseline PET lesion segmentation in the internal cohort. (A) shows violin plots of evaluation metrics, where vertical lines represent the interquartile ranges and white circles mark the median values. (B) compares the correlations between baseline PET metrics assessed by physicians and those measured by deep learning models. Actual Spearman’s correlation values are marked by circles and their 95$\%$ confidence intervals are denoted by error bars. FPV = false positive volume, FNV = false negative volume, MTV = metabolic tumor volume, TLG = total lesion glycolysis, SUVmax = maximum lesion standardized uptake value, Dmax = maximum tumor dissemination, Dspleen = maximum distance between the lesion and the spleen.
Figure 4: Comparison of physician-based and automatically extracted PET metrics. Spearman’s $\rho$ correlations are shown in the top left corner of each plot. Correlation values are presented as mean [2.5th percentile, 97.5th percentile].
Figure 5: Nine different examples of longitudinally-aware segmentation network (LAS-Net) output. Each case has maximum intensity projections (MIPs) of baseline and interim PET images with overlaying MIPs of the reference and predicted lesion masks. DS = Deauville score.
...and 6 more figures
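
The overlap criterion in Figure 2(A), where a predicted lesion counts as a true positive if it shares at least one voxel with a reference lesion, can be sketched as follows. This is a minimal illustration, not the authors' evaluation code. It assumes lesions have already been separated into individual connected components and are represented as sets of voxel coordinates. Counting recall over reference lesions is also an assumption, and the name `detection_scores` is hypothetical.

```python
def detection_scores(pred_lesions, ref_lesions):
    """Lesion-level precision, recall, and F1 under a one-voxel overlap criterion.

    Each lesion is a set of (z, y, x) voxel coordinates.
    """
    # Predicted lesions that touch at least one reference lesion.
    tp_pred = sum(1 for p in pred_lesions if any(p & r for r in ref_lesions))
    # Reference lesions that are touched by at least one predicted lesion.
    tp_ref = sum(1 for r in ref_lesions if any(r & p for p in pred_lesions))

    precision = tp_pred / len(pred_lesions) if pred_lesions else 0.0
    recall = tp_ref / len(ref_lesions) if ref_lesions else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy case: one predicted lesion overlaps a reference lesion,
# one prediction is a false positive, and one reference lesion is missed.
predicted = [{(0, 0, 0), (0, 0, 1)}, {(5, 5, 5)}]
reference = [{(0, 0, 1)}, {(9, 9, 9)}]
precision, recall, f1 = detection_scores(predicted, reference)
```

Returning zero when a scan has no predicted or no reference lesions is a convention chosen for this sketch. How such scans are scored is not stated in the text above.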
