ONCOPILOT: A Promptable CT Foundation Model For Solid Tumor Evaluation

Léo Machado, Hélène Philippe, Élodie Ferreres, Julien Khlaut, Julie Dupuis, Korentin Le Floch, Denis Habip Gatenyo, Pascal Roux, Jules Grégory, Maxime Ronot, Corentin Dancette, Tom Boeken, Daniel Tordjman, Pierre Manceron, Paul Hérent

TL;DR

ONCOPILOT is a promptable CT foundation model for solid tumor evaluation that delivers accurate 3D segmentation and RECIST measurements while enabling volumetric analysis. Built on a SAM-based architecture and trained on roughly 7,500 CT scans, it turns 2D visual prompts into 3D tumor masks that radiologists can refine interactively, achieving radiologist-level RECIST performance and outperforming an nnUnet baseline. The system speeds up measurements and reduces inter-reader variability, and it extends beyond RECIST to volumetric biomarkers for better patient stratification. Limitations include dataset imbalance (e.g., many lung lesions in testing); future work will pursue balanced longitudinal validation and broader clinical integration.

Abstract

Carcinogenesis is a proteiform phenomenon, with tumors emerging in various locations and displaying complex, diverse shapes. At the crucial intersection of research and clinical practice, it demands precise and flexible assessment. However, current biomarkers, such as RECIST 1.1's long and short axis measurements, fall short of capturing this complexity, offering an approximate estimate of tumor burden and a simplistic representation of a more intricate process. Additionally, existing supervised AI models face challenges in addressing the variability in tumor presentations, limiting their clinical utility. These limitations arise from the scarcity of annotations and the models' focus on narrowly defined tasks. To address these challenges, we developed ONCOPILOT, an interactive radiological foundation model trained on approximately 7,500 CT scans covering the whole body, from both normal anatomy and a wide range of oncological cases. ONCOPILOT performs 3D tumor segmentation using visual prompts like point-click and bounding boxes, outperforming state-of-the-art models (e.g., nnUnet) and achieving radiologist-level accuracy in RECIST 1.1 measurements. The key advantage of this foundation model is its ability to surpass state-of-the-art performance while keeping the radiologist in the loop, a capability that previous models could not achieve. When radiologists interactively refine the segmentations, accuracy improves further. ONCOPILOT also accelerates measurement processes and reduces inter-reader variability, facilitating volumetric analysis and unlocking new biomarkers for deeper insights. This AI assistant is expected to enhance the precision of RECIST 1.1 measurements, unlock the potential of volumetric biomarkers, and improve patient stratification and clinical care, while seamlessly integrating into the radiological workflow.
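As an illustration of the volumetric analysis mentioned in the abstract, the following minimal sketch converts a binary 3D segmentation mask into a volume in millilitres. It is not taken from the paper: the function name `mask_volume_ml`, the `(z, y, x)` array layout, and the spacing argument are assumptions made for this example.

```python
import numpy as np


def mask_volume_ml(mask, spacing_mm):
    """Volume of a binary 3D mask in millilitres.

    mask: array of shape (z, y, x); nonzero voxels belong to the lesion.
    spacing_mm: (dz, dy, dx) voxel size in millimetres (assumed layout).
    """
    voxel_mm3 = float(np.prod(spacing_mm))   # volume of one voxel in mm^3
    n_voxels = int(np.count_nonzero(mask))   # number of lesion voxels
    return n_voxels * voxel_mm3 / 1000.0     # 1 mL = 1000 mm^3


# Toy example (synthetic data): a 10 x 10 x 10 voxel cube.
toy = np.zeros((20, 20, 20), dtype=bool)
toy[5:15, 5:15, 5:15] = True
volume = mask_volume_ml(toy, (1.0, 1.0, 1.0))
```

With 1 mm isotropic spacing the toy cube contains 1,000 voxels of 1 mm³ each, so its volume is 1 mL. Real CT scans usually have anisotropic spacing, which is why the voxel size is passed explicitly rather than assumed.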

Paper Structure

This paper contains 17 sections, 1 equation, 7 figures.

Figures (7)

  • Figure 1: ONCOPILOT Foundation Model Training and Evaluation. (A) Overview of the datasets used for training the ONCOPILOT segmentation model, including the distribution across train, test, and validation sets. (B) Diagram illustrating the ONCOPILOT segmentation model's workflow. The model accepts visual prompts (either point-clicks or bounding boxes) of 3D tumor volumes and outputs corresponding 3D segmentation masks. Optional editing allows for real or simulated radiologist interaction, where positive and negative edit-points can be set manually in a viewer environment or automatically during evaluation.
  • Figure 2: ONCOPILOT Performance Against Baseline. (A) Radar plot (top) and table (bottom) displaying segmentation DICE scores across 7 lesion types for 3 different ONCOPILOT models (point, point-edit, bbox) compared to the best-performing baseline from the ULS23 segmentation challenge on the 10% held-out test set. (B) Examples of successful segmentations from the test set, comparing point mode (left columns) and bbox mode (right columns). The top row shows the visual prompt provided to the model, the middle row displays the ground truth mask for that slice, and the bottom row presents the ONCOPILOT model’s predicted segmentation.
  • Figure 3: ONCOPILOT Performance on Different Lesion Types. Bar plot showing the mean DICE scores from ONCOPILOT segmentation masks in point mode (red) and point-edit mode (blue) for: (A) spherical lesions (sphericity $> 0.6$) versus irregular lesions (see Methods for the sphericity formula), (B) large lesions (long axis $> 15$ mm) versus smaller lesions, (C) voluminous lesions (volume $> 1$ mL) versus smaller lesions. (D) Boxplot displaying the distribution of DICE scores produced by ONCOPILOT in point mode (red) and point-edit mode (blue) across various lesion types in the 10% held-out test set, with median values and interquartile ranges highlighted. (E) Boxplot showing RECIST measurements derived from ONCOPILOT’s predicted masks in point mode (red) and point-edit mode (blue) across different lesion types in the 10% held-out test set, highlighting median values and interquartile ranges. The long axis is defined as the longest possible line in the axial plane across the predicted 3D mask. ***: p-value $< 0.001$; n.s.: non-significant.
  • Figure 4: ONCOPILOT Integration Into Radiologist’s Workflow. (A) Diagram and results comparing ONCOPILOT in point, point-edit, and bbox modes against three radiologists for the long-axis measurement of diverse oncological lesions. Median absolute error (mm) and median relative error (% of lesion size) are shown. P-values from t-tests compare the long-axis measurement error of the ONCOPILOT models with that of the radiologists; no difference is statistically significant (p $\geq$ 0.05). The long axis is the longest line in the axial plane across the predicted 3D mask. (B) Boxplot (bottom) of ONCOPILOT’s tumor long-axis measurement performance against radiologists. Left: median absolute error (mm) vs. ground truth. Right: median relative error (% of lesion size). Median and interquartile ranges are shown. (C) Diagram of an experiment evaluating radiologists' inter-operator variability and measurement time when measuring the tumor long axis in a digital viewer, manually vs. with ONCOPILOT assistance (bbox mode). (D) Boxplots show radiologists' inter-operator variability in measurement error (left) and measurement time (right) using manual vs. ONCOPILOT-assisted annotations across diverse tumors, with t-test p-values; n=3.
  • Figure S1: ONCOPILOT Long Axis Performance Across Different Organs. (A) Table showing the mean and median long-axis measurements (in mm) for the various organ types in the test set. (B) Example of a suboptimal segmentation by ONCOPILOT on a small lung nodule from the LIDC-IDRI dataset with magnification of the overlay on the rightmost panel, with a DICE of 0.66 in point mode.
  • ...and 2 more figures
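
The captions above rely on three quantities: the DICE score between predicted and ground-truth masks, the long axis defined as the longest line in the axial plane across the 3D mask, and the absolute and relative long-axis errors. The sketch below shows one plausible way to compute them with NumPy. It is an illustration only, not the authors' implementation: all function names, the `(z, y, x)` mask layout, and the brute-force pairwise search are assumptions made for this example.

```python
import numpy as np


def dice_score(pred, truth):
    """DICE overlap between two binary masks of identical shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * np.logical_and(pred, truth).sum() / denom


def axial_long_axis_mm(mask, spacing_yx_mm):
    """Longest in-plane distance (mm) between two lesion voxels, over all axial slices.

    mask: array of shape (z, y, x) (assumed layout).
    Distances are taken between voxel centres, so the result is slightly
    shorter than an edge-to-edge measurement. Brute force: O(n^2) per slice.
    """
    scale = np.asarray(spacing_yx_mm, dtype=float)
    best = 0.0
    for axial_slice in np.asarray(mask, dtype=bool):
        pts = np.argwhere(axial_slice) * scale   # (n, 2) coordinates in mm
        if len(pts) < 2:
            continue
        diff = pts[:, None, :] - pts[None, :, :]  # all pairwise differences
        best = max(best, float(np.sqrt((diff ** 2).sum(-1)).max()))
    return best


def long_axis_errors(pred_mm, truth_mm):
    """Absolute error (mm) and relative error (% of the reference size)."""
    abs_err = abs(pred_mm - truth_mm)
    return abs_err, 100.0 * abs_err / truth_mm


# Toy example (synthetic data): one axial slice holds a 3 x 8 voxel rectangle.
truth = np.zeros((3, 10, 10), dtype=bool)
truth[1, 2:5, 1:9] = True
pred = np.zeros_like(truth)
pred[1, 2:5, 3:9] = True                          # prediction misses two columns

dice = dice_score(pred, truth)                    # 2 * 18 / (24 + 18)
truth_axis = axial_long_axis_mm(truth, (1.0, 1.0))
pred_axis = axial_long_axis_mm(pred, (1.0, 1.0))
abs_err, rel_err = long_axis_errors(pred_axis, truth_axis)
```

The pairwise search is adequate for lesion-sized slices. For very large masks, restricting the search to boundary or convex-hull voxels would give the same maximum with far less memory.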