Table of Contents
Fetching ...

Automated radiotherapy treatment planning guided by GPT-4Vision

Sheng Liu, Oscar Pastor-Serrano, Yizheng Chen, Matthew Gopaulchan, Weixing Liang, Mark Buyyounouski, Erqi Pollom, Quynh-Thu Le, Michael Gensheimer, Peng Dong, Yong Yang, James Zou, Lei Xing

TL;DR

This work tackles the time-consuming, subjective nature of radiotherapy treatment planning by introducing GPT-RadPlan, a GPT-4V–driven framework that acts as both evaluator and planner through in-context learning. It frames planning as a two-loop optimization, where the inner loop performs fluence-map optimization and the outer loop tunes objective weights, guided by three LLM-driven modules (evaluation, memory, planning). Across 17 prostate and 13 head & neck VMAT plans, GPT-RadPlan matches or exceeds clinical plans, achieving improved target coverage and reduced organ-at-risk doses on average, with 3–6 iterations taking roughly 2–3 hours. The approach requires no domain-specific model training and operates with clinical protocols and reference plans, offering a promising, workflow-embedded copilot for radiotherapy planning while acknowledging limitations such as lack of Pareto guarantees and dependence on input prompts and TPS compatibility.

Abstract

Objective: Radiotherapy treatment planning is a time-consuming and potentially subjective process that requires the iterative adjustment of model parameters to balance multiple conflicting objectives. Recent advancements in frontier Artificial Intelligence (AI) models offer promising avenues for addressing the challenges in planning and clinical decision-making. This study introduces GPT-RadPlan, an automated treatment planning framework that integrates radiation oncology knowledge with the reasoning capabilities of large multi-modal models, such as GPT-4Vision (GPT-4V) from OpenAI. Approach: Via in-context learning, we incorporate clinical requirements and a few (3 in our experiments) approved clinical plans with their optimization settings, enabling GPT-4V to acquire treatment planning domain knowledge. The resulting GPT-RadPlan system is integrated into our in-house inverse treatment planning system through an application programming interface (API). For a given patient, GPT-RadPlan acts as both plan evaluator and planner, first assessing dose distributions and dose-volume histograms (DVHs), and then providing textual feedback on how to improve the plan to match the physician's requirements. In this manner, GPT-RadPlan iteratively refines the plan by adjusting planning parameters, such as weights and dose objectives, based on its suggestions. Main results: The efficacy of the automated planning system is showcased across 17 prostate cancer and 13 head and neck cancer VMAT plans with prescribed doses of 70.2 Gy and 72 Gy, respectively, where we compared GPT-RadPlan results to clinical plans produced by human experts. In all cases, GPT-RadPlan either outperformed or matched the clinical plans, demonstrating superior target coverage and reducing organ-at-risk doses by 5 Gy on average (15 percent for prostate and 10-15 percent for head and neck).

Automated radiotherapy treatment planning guided by GPT-4Vision

TL;DR

This work tackles the time-consuming, subjective nature of radiotherapy treatment planning by introducing GPT-RadPlan, a GPT-4V–driven framework that acts as both evaluator and planner through in-context learning. It frames planning as a two-loop optimization, where the inner loop performs fluence-map optimization and the outer loop tunes objective weights, guided by three LLM-driven modules (evaluation, memory, planning). Across 17 prostate and 13 head & neck VMAT plans, GPT-RadPlan matches or exceeds clinical plans, achieving improved target coverage and reduced organ-at-risk doses on average, with 3–6 iterations taking roughly 2–3 hours. The approach requires no domain-specific model training and operates with clinical protocols and reference plans, offering a promising, workflow-embedded copilot for radiotherapy planning while acknowledging limitations such as lack of Pareto guarantees and dependence on input prompts and TPS compatibility.

Abstract

Objective: Radiotherapy treatment planning is a time-consuming and potentially subjective process that requires the iterative adjustment of model parameters to balance multiple conflicting objectives. Recent advancements in frontier Artificial Intelligence (AI) models offer promising avenues for addressing the challenges in planning and clinical decision-making. This study introduces GPT-RadPlan, an automated treatment planning framework that integrates radiation oncology knowledge with the reasoning capabilities of large multi-modal models, such as GPT-4Vision (GPT-4V) from OpenAI. Approach: Via in-context learning, we incorporate clinical requirements and a few (3 in our experiments) approved clinical plans with their optimization settings, enabling GPT-4V to acquire treatment planning domain knowledge. The resulting GPT-RadPlan system is integrated into our in-house inverse treatment planning system through an application programming interface (API). For a given patient, GPT-RadPlan acts as both plan evaluator and planner, first assessing dose distributions and dose-volume histograms (DVHs), and then providing textual feedback on how to improve the plan to match the physician's requirements. In this manner, GPT-RadPlan iteratively refines the plan by adjusting planning parameters, such as weights and dose objectives, based on its suggestions. Main results: The efficacy of the automated planning system is showcased across 17 prostate cancer and 13 head and neck cancer VMAT plans with prescribed doses of 70.2 Gy and 72 Gy, respectively, where we compared GPT-RadPlan results to clinical plans produced by human experts. In all cases, GPT-RadPlan either outperformed or matched the clinical plans, demonstrating superior target coverage and reducing organ-at-risk doses by 5 Gy on average (15 percent for prostate and 10-15 percent for head and neck).
Paper Structure (19 sections, 3 equations, 9 figures, 3 tables, 1 algorithm)

This paper contains 19 sections, 3 equations, 9 figures, 3 tables, 1 algorithm.

Figures (9)

  • Figure 1: Overview of GPT-RadPlan workflow. a. Integration of multi-modal LLM-based RT treatment planning into the existing clinical workflow, using GPT-4V. GPT-RadPlan can be human-in-the-loop where the users can use human language to provide feedback on how to improve the current plan when needed, and iteratively adjust optimization weights. b. Modules for the application of GPT-4V at different steps of the RT workflow. Based on the independent evaluation of the dose distribution by an image AI expert, and the DVH assessment from the DVH AI expert, GPT-4V reviews the plan, providing feedback to the AI planner, and approves the plan after the plan is good enough. The planning module translates feedback from the evaluation module into concrete optimization parameters, enabling iterative refinement of the treatment plan. GPT-4V's in-context learning capabilities facilitate this process without requiring additional model training, relying instead on input prompts and historical planning data.
  • Figure 2: DVH comparison. Visual comparison of the DVHs for (left) GPT-RadPlan plans, and (right) clinical plans. Solid lines show the mean values, while the shaded bands indicate the standard deviation. For the OARs, better plans are usually characterized by lines that are close to the bottom left corner, implying greater OAR sparing.
  • Figure 3: Average DVH comparison. Visual comparison of the average DVH lines across all prostate (left) and lung (right) patients. Solid lines represent GPT-RadPlan plans, while dashed lines indicate clinical plans.
  • Figure 4: Differences between GPT-RadPlan and clinical plans. For every relevant OAR, each box displays the relative difference between clinical and GPT-RadPlan plans of all the patients in the prostate or head & neck cohorts. Each panel shows the relative difference between different OAR metrics, including (left) the mean dose, (center) the $D_{50}$ and (right) the $V_{15}$. Positive values above the red line indicate that GPT-RadPlan plans result in lower metric values, which is preferred.
  • Figure 5: Planning trajectories. From left to right, each column shows the evolution of the dose distribution (2D slice centered at the iso-center) and the corresponding DVH. The top figure (a) shows one of the prostate cases requiring only 3 steps to obtain a treatment plan that meets the clinical protocol requirements, while (b) depicts the trajectory for a head & neck case requiring 6 optimization steps. As the iteration number increases, GPT-RadPlan first ensures homogeneous PTV coverage, subsequently reducing the dose delivered to the OARs.
  • ...and 4 more figures