Longitudinal Assessment of Lung Lesion Burden in CT

Tejas Sudharshan Mathai, Benjamin Hou, Ronald M. Summers

TL;DR

The paper tackles longitudinal assessment of total lung lesion burden in CT. Two 3D nnUNet segmentation models, one with anatomical priors and one without, were trained on the public UniToChest dataset to enable interval change tracking of tumor burden. The model without priors generally outperforms the model with priors in detection and segmentation, though priors can improve volumetric agreement. Detection of clinically significant lesions ($>$ 1 cm) reached $71.3\%$ precision, $68.4\%$ sensitivity, and $69.8\%$ F1, with a Dice score of $77.1\pm20.3$ and a Hausdorff distance (HD) error of $11.7\pm24.1$ mm. Median lesion burdens were around $6.4$ cc (noPriors) and $5.7$ cc (withPriors), and the median difference between manual and automated volumes was near $0.02$ cc, highlighting the potential for personalized interval burden tracking. The approach is promising for integrating automated burden assessments into clinical workflows and PACS, but the study is limited by label noise in UniToChest and the lack of direct correlation with clinical disease status, indicating a need for cleaner datasets and further clinical validation.
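As a quick consistency check on the detection figures above, the F1-score is the harmonic mean of precision $P$ and sensitivity (recall) $R$. Plugging in the reported values reproduces the reported F1; this uses only the standard definition and is not taken from the paper's own evaluation code.

$$
F_1 = \frac{2PR}{P + R} = \frac{2 \times 0.713 \times 0.684}{0.713 + 0.684} = \frac{0.975}{1.397} \approx 0.698
$$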

Abstract

In the U.S., lung cancer is the second major cause of death. Early detection of suspicious lung nodules is crucial for patient treatment planning, management, and improving outcomes. Many approaches for lung nodule segmentation and volumetric analysis have been proposed, but few have looked at longitudinal changes in total lung tumor burden. In this work, we trained two 3D models (nnUNet) with and without anatomical priors to automatically segment lung lesions and quantified total lesion burden for each patient. The 3D model without priors significantly outperformed ($p < .001$) the model trained with anatomy priors. For detecting clinically significant lesions $>$ 1 cm, a precision of 71.3\%, sensitivity of 68.4\%, and F1-score of 69.8\% were achieved. For segmentation, a Dice score of 77.1 $\pm$ 20.3 and Hausdorff distance error of 11.7 $\pm$ 24.1 mm were obtained. The median lesion burden was 6.4 cc (IQR: 2.1, 18.1) and the median volume difference between manual and automated measurements was 0.02 cc (IQR: -2.8, 1.2). Agreements were also evaluated with linear regression and Bland-Altman plots. The proposed approach can produce a personalized evaluation of the total tumor burden for a patient and facilitate interval change tracking over time.
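To make the notion of total lesion burden concrete, here is a minimal sketch of how a burden in cc could be derived from a binary segmentation mask and the CT voxel spacing, and how an interval change between two exams could be computed. The function name `lesion_burden_cc`, the toy masks, and the spacing values are hypothetical and chosen for illustration; the paper's actual post-processing is not described in this summary.

```python
import numpy as np

def lesion_burden_cc(mask, spacing_mm):
    """Total lesion volume in cubic centimetres (cc) from a binary 3D mask.

    mask       : array where non-zero voxels belong to any lesion
    spacing_mm : voxel size along each axis in millimetres (assumed known
                 from the CT header)
    """
    voxel_volume_mm3 = float(np.prod(spacing_mm))
    n_lesion_voxels = int(np.count_nonzero(mask))
    return n_lesion_voxels * voxel_volume_mm3 / 1000.0  # 1 cc = 1000 mm^3

# Hypothetical baseline exam: one lesion of 4 x 20 x 20 = 1600 voxels.
spacing_mm = (2.5, 0.8, 0.8)  # illustrative values, not from the dataset
baseline_mask = np.zeros((10, 64, 64), dtype=bool)
baseline_mask[2:6, 10:30, 10:30] = True

# Hypothetical follow-up exam: the lesion has grown to 4 x 25 x 20 voxels.
followup_mask = np.zeros((10, 64, 64), dtype=bool)
followup_mask[2:6, 10:35, 10:30] = True

baseline_cc = lesion_burden_cc(baseline_mask, spacing_mm)
followup_cc = lesion_burden_cc(followup_mask, spacing_mm)
interval_change_cc = followup_cc - baseline_cc

print(f"baseline {baseline_cc:.2f} cc, follow-up {followup_cc:.2f} cc, "
      f"change {interval_change_cc:+.2f} cc")
```

Summing over all lesion voxels, rather than per connected component, gives one number per exam, which is what makes the burden easy to plot across visits.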

Paper Structure

This paper contains 9 sections, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Flowchart of the lung lesion segmentation pipeline and assessment of interval changes in total lesion burden. A 3D full-resolution nnUNet was trained to segment lung lesions (red) in patients with longitudinal exams from the public UniToChest dataset. As shown in the graph, a personalized estimate of the total lung lesion burden for a patient (ID #30) was obtained for interval change tracking over time.
  • Figure 2: Dice scores (a) and Hausdorff distance (HD) errors (d) for lung lesions $>$ 1 cm. Correlation of reference vs. predicted lesion volumes for lesions segmented by the 3D nnUNet model without (b) and with (e) anatomy priors for all test CT data. Total lung lesion burden over time for two patients (c, f) as computed by the 3D nnUNet trained without priors.
  • Figure 3: Dice scores (unitless, top row) and Hausdorff distance (HD) errors (mm, bottom row) for the segmentation of lung nodules of different sizes by the 3D full-resolution nnUNet models trained with (purple) and without (gold) anatomical priors.
  • Figure 4: Volumetric agreement between the ground-truth lung nodule annotations and the automated measurements. Column (a) shows the linear regression plots while column (b) shows the Bland-Altman plots. The gray bands in the scatter plots denote the 95% confidence intervals (CI) for the best fit regression line (solid blue line). The upper and lower lines in column (b) represent the 95% CI.
  • Figure 5: The trajectories of total lung tumor burden for four different patients computed from the segmentations of the 3D model trained with no anatomical priors. The ages at the time of the studies (initial and follow-up) were not provided in the UniToChest dataset. For patients 51 and 366 in (a) and (b) respectively, the total tumor burden increased over time, with an especially dramatic increase for patient 366. In (c), the tumor burden for patient 533 at the first and last visits was greater than the ground-truth annotation because of false positive predictions. Finally, for patient 606 in (d), the tumor burden rose steadily across the visits.
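The volumetric agreement analysis described for Figure 4 can be sketched as follows. This is a minimal illustration of the standard Bland-Altman quantities (mean difference and limits at the mean $\pm$ 1.96 standard deviations), not the authors' code. The sign convention (automated minus manual), the function name `bland_altman`, and the toy volumes are assumptions; whether the paper's plots use exactly these limits is not stated in the caption.

```python
import numpy as np

def bland_altman(manual_cc, automated_cc):
    """Return (bias, lower_limit, upper_limit) for paired volume measurements.

    bias  : mean of (automated - manual); the sign convention is an assumption
    limits: bias -/+ 1.96 * sample standard deviation of the differences
    """
    manual_cc = np.asarray(manual_cc, dtype=float)
    automated_cc = np.asarray(automated_cc, dtype=float)
    diff = automated_cc - manual_cc
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Toy paired volumes in cc (hypothetical, not taken from the paper).
manual = [1.0, 2.0, 3.0, 4.0]
automated = [1.5, 1.5, 3.5, 3.5]

bias, lower, upper = bland_altman(manual, automated)
print(f"bias {bias:.3f} cc, limits [{lower:.3f}, {upper:.3f}] cc")
```

In the plot itself, each pair would be drawn at x = mean of the two measurements and y = their difference, with horizontal lines at the bias and the two limits.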