Finding Reproducible and Prognostic Radiomic Features in Variable Slice Thickness Contrast Enhanced CT of Colorectal Liver Metastases

Jacob J. Peoples, Mohammad Hamghalam, Imani James, Maida Wasim, Natalie Gangai, Hyunseon Christine Kang, X. John Rong, Yun Shin Chun, Richard K. G. Do, Amber L. Simpson

TL;DR

This paper investigates how radiomic features from contrast-enhanced CT of colorectal liver metastases (CRLM) behave under variable slice thickness and across multiple feature-extractor settings, assessing both reproducibility and prognostic value. Using a prospective 81-patient dataset for reproducibility and an independent 197-patient survival dataset, the authors extract 93 features across eight extractor configurations from two ROIs (largest tumor and liver parenchyma) and evaluate reproducibility with concordance correlation coefficients (CCC) and survival performance with cross-validated Cox models. They find that reproducibility and prognostic utility depend on ROI and feature type, and that no single extractor is universally best; a data-driven approach that pools features across settings and excludes features below a reproducibility threshold performs on par with the best single-setting model (C-index 0.629 versus 0.630). The results underscore the value of integrating reproducibility metrics into feature selection and support using diverse extraction settings to build robust radiomic signatures for CRLM prognosis, while noting limitations such as fixed bin count and anisotropic voxel sizes. The work provides a practical framework for multi-parameter feature selection aimed at reproducible, prognostic radiomic biomarkers in CT for CRLM.

Abstract

Establishing the reproducibility of radiomic signatures is a critical step in the path to clinical adoption of quantitative imaging biomarkers; however, radiomic signatures must also be meaningfully related to an outcome of clinical importance to be of value for personalized medicine. In this study, we analyze both the reproducibility and prognostic value of radiomic features extracted from the liver parenchyma and largest liver metastases in contrast-enhanced CT scans of patients with colorectal liver metastases (CRLM). A prospective cohort of 81 patients from two major US cancer centers was used to establish the reproducibility of radiomic features extracted from images reconstructed with different slice thicknesses. A publicly available, single-center cohort of 197 preoperative scans from patients who underwent hepatic resection for treatment of CRLM was used to evaluate the prognostic value of features and models to predict overall survival. A standard set of 93 features was extracted from all images, with a set of eight different extractor settings. The feature extraction settings producing the most reproducible, as well as the most prognostically discriminative, feature values were highly dependent on both the region of interest and the specific feature in question. While the best overall predictive model was produced using features extracted with a particular setting, without accounting for reproducibility (C-index = 0.630 (0.603--0.649)), an equivalent-performing model (C-index = 0.629 (0.605--0.645)) was produced by pooling features from all extraction settings and excluding features with low reproducibility (retaining only those with $\mathrm{CCC} \geq 0.85$) prior to feature selection. Our findings support a data-driven approach to feature extraction and selection, preferring the inclusion of many features, and narrowing feature selection based on reproducibility when relevant data is available.
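
The reproducibility filter described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes Lin's concordance correlation coefficient computed with population (divide-by-n) moments, and the function names, the dictionary layout keyed by (ROI, extractor setting, feature), and the example values are all hypothetical.

```python
from statistics import fmean


def concordance_correlation(x, y):
    """Lin's CCC between two paired lists of measurements.

    Assumption: population (divide-by-n) variance and covariance are used.
    Returns NaN when both inputs are constant and equal (0/0).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = fmean([(a - mean_x) ** 2 for a in x])
    var_y = fmean([(b - mean_y) ** 2 for b in y])
    cov_xy = fmean([(a - mean_x) * (b - mean_y) for a, b in zip(x, y)])
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    return float("nan") if denom == 0 else 2.0 * cov_xy / denom


def reproducible_features(reference, alternate, threshold=0.85):
    """Keep the keys whose CCC between two reconstructions meets the threshold.

    reference, alternate: dicts mapping a hypothetical key
    (roi, extractor_setting, feature_name) to per-patient values, with
    patients in the same order in both dicts. Pooling across extractor
    settings simply means all settings share one dict.
    """
    return [
        key
        for key in reference
        if concordance_correlation(reference[key], alternate[key]) >= threshold
    ]


# Invented values for two pooled features measured at two slice thicknesses.
thick = {
    ("tumor", "setting_1", "Mean"): [1.0, 2.0, 3.0, 4.0],
    ("tumor", "setting_2", "Entropy"): [1.0, 2.0, 3.0, 4.0],
}
thin = {
    ("tumor", "setting_1", "Mean"): [1.1, 2.0, 2.9, 4.1],
    ("tumor", "setting_2", "Entropy"): [4.0, 1.0, 3.0, 2.0],
}
kept = reproducible_features(thick, thin)
```

Unlike Pearson correlation, the CCC penalizes systematic shifts between the two measurements through the squared mean difference in the denominator, which is why it suits a comparison of the same feature across reconstructions. Only the features in `kept` would then be passed on to feature selection and survival modeling.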

Paper Structure (20 sections, 1 equation, 8 figures, 5 tables)

Figures (8)

  • Figure 1: A histogram of the in-plane pixel spacing for images in our dataset. Pixel spacing was consistent across all reconstructions, so only counts for the reference reconstructions (5 mm slice thickness and 20% ASiR) are shown.
  • Figure 2: A histogram of the slice thickness for images in the survival data set. Slice thicknesses fell in the range $[0.8,7.5]$ mm.
  • Figure 3: Box plots of the features compared pairwise across slice thicknesses, broken down by feature extraction setting. The statistical significance of the change in CCC value between different pairs of slice thicknesses is listed below the plot, based on the results of a Wilcoxon signed-rank test.
  • Figure 4: Cluster maps of the CCC (A) and C-index (B) of all features. Each row corresponds to a unique feature, while each column is an extractor setting. For the CCC, the liver and tumor extractor results are joined, and clustered together, to emphasize the patterns of reproducibility across ROI and extractor. For the C-index, the liver and tumor results are clustered separately. The feature class for each row is indicated by the left-most column of each heat map.
  • Figure 5: (A)--(C): Bar plots counting the number of features for which each extractor produces the highest CCC (A), the highest C-index (B), or a pair of CCC and C-index that is Pareto efficient for that feature (C). (D): A bar plot of how many features from each extractor are on the Pareto front for all features across all extractors. In (A)--(D), the line indicators show the number of features broken down by ROI (tumor or liver for left and right, respectively), while the bar height corresponds to the average across the two ROIs. (E): A scatter plot of the C-index and CCC for all features, color coded by ROI, with points on the Pareto front rendered with full opacity and circled in red.
  • ...and 3 more figures
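
The Pareto efficiency mentioned in the Figure 5 caption can be illustrated with a short sketch. It assumes that both CCC and C-index are to be maximized and that a point is Pareto efficient when no other point is at least as good on both metrics and strictly better on one. The function name, labels, and values below are invented for illustration and are not taken from the paper.

```python
def pareto_front(points):
    """Return the labels of the non-dominated points.

    points: list of (label, ccc, c_index) tuples; both metrics are
    treated as higher-is-better (an assumption of this sketch).
    """
    front = []
    for label, ccc, c_index in points:
        dominated = any(
            other_ccc >= ccc
            and other_c >= c_index
            and (other_ccc > ccc or other_c > c_index)
            for _, other_ccc, other_c in points
        )
        if not dominated:
            front.append(label)
    return front


# Hypothetical (CCC, C-index) pairs for one feature under four extractor settings.
candidates = [
    ("setting_a", 0.90, 0.55),
    ("setting_b", 0.80, 0.60),
    ("setting_c", 0.70, 0.58),
    ("setting_d", 0.95, 0.50),
]
efficient = pareto_front(candidates)
```

Here `setting_c` is dominated by `setting_b`, which is better on both metrics, while the other three each represent a different trade-off between reproducibility and prognostic discrimination. The same function applies whether the points are the extractor settings of a single feature, as in panel (C), or all features across all extractors, as in panels (D) and (E).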