Longitudinal Assessment of Lung Lesion Burden in CT
Tejas Sudharshan Mathai, Benjamin Hou, Ronald M. Summers
TL;DR
The paper addresses longitudinal assessment of total lung lesion burden in CT. Two 3D nnUNet segmentation models, one trained with anatomical priors and one without, were trained on the public UniToChest dataset to enable interval change tracking of tumor burden. The model without priors generally outperforms the model with priors in detection and segmentation, though priors can improve volumetric agreement. For clinically significant lesions ($>$1 cm), detection reached $71.3\%$ precision, $68.4\%$ sensitivity, and $69.8\%$ F1, and segmentation reached a Dice score of $77.1\pm20.3$ and a Hausdorff distance of $11.7\pm24.1$ mm. Median lesion burdens were around $6.4$ cc (noPriors) and $5.7$ cc (withPriors), with median differences between manual and automated volumes near $0.02$ cc, highlighting the potential for personalized interval burden tracking. The approach is promising for integrating automated burden assessments into clinical workflows and PACS, but the study is limited by label noise in UniToChest and by the lack of direct correlation with clinical disease status, indicating a need for cleaner datasets and further clinical validation.
Abstract
In the U.S., lung cancer is the second major cause of death. Early detection of suspicious lung nodules is crucial for patient treatment planning, management, and improving outcomes. Many approaches for lung nodule segmentation and volumetric analysis have been proposed, but few have looked at longitudinal changes in total lung tumor burden. In this work, we trained two 3D nnUNet models, one with and one without anatomical priors, to automatically segment lung lesions, and we quantified the total lesion burden for each patient. The 3D model without priors significantly outperformed ($p < .001$) the model trained with anatomical priors. For detecting clinically significant lesions $>$ 1 cm, a precision of 71.3\%, sensitivity of 68.4\%, and F1-score of 69.8\% were achieved. For segmentation, a Dice score of 77.1 $\pm$ 20.3 and a Hausdorff distance error of 11.7 $\pm$ 24.1 mm were obtained. The median lesion burden was 6.4 cc (IQR: 2.1, 18.1) and the median volume difference between manual and automated measurements was 0.02 cc (IQR: -2.8, 1.2). Agreement was also evaluated with linear regression and Bland-Altman plots. The proposed approach can produce a personalized evaluation of the total tumor burden for a patient and facilitate interval change tracking over time.
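As an illustration of the quantities reported above, the following is a minimal sketch and not the authors' implementation. It assumes a binary lesion mask stored as a NumPy array with known voxel spacing in millimetres. The function names (`lesion_burden_cc`, `dice_percent`, `f1_score`, `bland_altman`) are hypothetical, and the sign convention for the volume difference (manual minus automated) is an assumption, since the abstract does not state it.

```python
import numpy as np

def lesion_burden_cc(mask, spacing_mm):
    """Total lesion volume in cc for a binary mask (1 cc = 1000 mm^3)."""
    voxel_mm3 = float(np.prod(spacing_mm))  # volume of a single voxel
    return float(np.count_nonzero(mask)) * voxel_mm3 / 1000.0

def dice_percent(pred, ref):
    """Dice overlap between two binary masks, expressed in percent."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    denom = int(pred.sum()) + int(ref.sum())
    if denom == 0:
        return 100.0  # assumption: two empty masks count as perfect agreement
    return 200.0 * int(np.logical_and(pred, ref).sum()) / denom

def f1_score(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (same units as the inputs)."""
    return 2.0 * precision * sensitivity / (precision + sensitivity)

def bland_altman(manual, automated):
    """Bias and 95% limits of agreement; difference is manual - automated (assumed)."""
    diff = np.asarray(manual, dtype=float) - np.asarray(automated, dtype=float)
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

With these definitions, the reported precision and sensitivity are consistent with the reported F1-score: `f1_score(71.3, 68.4)` rounds to 69.8. Tracking interval change would then amount to computing `lesion_burden_cc` per time point for a patient and comparing the values.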
