Zero-Shot Large Language Model Agents for Fully Automated Radiotherapy Treatment Planning

Dongrong Yang, Xin Wu, Yibo Xie, Xinyi Li, Qiuwen Wu, Jackie Wu, Yang Sheng

TL;DR

The study demonstrates a zero-shot LLM-based agent that directly interfaces with the Eclipse TPS to autonomously perform inverse planning for IMRT in head-and-neck cancer. By decomposing planning into domain-agnostic tasks, applying chain-of-thought prompting, and using an arithmetic module to track deviations in a quadratic objective loss, the agent iteratively refines optimization constraints without prior plans or fine-tuning. Evaluated on 20 cases, GPT-4.1-WP produced plans with dosimetric endpoints comparable to clinical plans and showed superior conformity and targeted sparing, while ablations without optimization priors degraded performance. The approach, implemented entirely inside a commercial TPS and completing in under 5 minutes, suggests a practical, generalizable pathway to reduce planning variability and broaden AI-assisted radiotherapy planning adoption.
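The deviation tracking mentioned above can be illustrated with a small sketch. The summary does not give the exact form of the objective, so the code assumes a common one-sided quadratic penalty for upper dose objectives, `weight * max(0, attained - limit)^2`. The names `UpperObjective`, `deviation`, and `quadratic_loss`, along with all numeric values, are hypothetical and are not taken from the paper or from the Eclipse API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class UpperObjective:
    """Hypothetical upper dose objective: attained dose should not exceed limit_gy."""
    name: str
    limit_gy: float
    weight: float


def deviation(obj: UpperObjective, attained_gy: float) -> float:
    # One-sided deviation: zero when the objective is met, positive otherwise.
    return max(0.0, attained_gy - obj.limit_gy)


def quadratic_loss(objectives: List[UpperObjective], attained: Dict[str, float]) -> float:
    # Assumed form of the loss: weighted sum of squared one-sided deviations.
    return sum(o.weight * deviation(o, attained[o.name]) ** 2 for o in objectives)


# Illustrative values only (not from the paper).
objectives = [
    UpperObjective("cord_dmax", limit_gy=40.0, weight=2.0),
    UpperObjective("parotid_mean", limit_gy=26.0, weight=1.0),
]
attained = {"cord_dmax": 43.0, "parotid_mean": 24.0}

per_objective = {o.name: deviation(o, attained[o.name]) for o in objectives}
total = quadratic_loss(objectives, attained)
```

Computing the deviations in ordinary code rather than asking the language model to do the arithmetic is one plausible reason for a separate arithmetic module: the model then reasons over exact numbers instead of estimating them.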

Abstract

Radiation therapy treatment planning is an iterative, expertise-dependent process, and the growing burden of cancer cases has made reliance on manual planning increasingly unsustainable, underscoring the need for automation. In this study, we propose a workflow that leverages a large language model (LLM)-based agent to navigate inverse treatment planning for intensity-modulated radiation therapy (IMRT). The LLM agent was implemented to directly interact with a clinical treatment planning system (TPS) to iteratively extract intermediate plan states and propose new constraint values to guide inverse optimization. The agent's decision-making process is informed by current observations and previous optimization attempts and evaluations, allowing for dynamic strategy refinement. The planning process was performed in a zero-shot inference setting, where the LLM operated without prior exposure to manually generated treatment plans and was utilized without any fine-tuning or task-specific training. The LLM-generated plans were evaluated on twenty head-and-neck cancer cases against clinical manual plans, with key dosimetric endpoints analyzed and reported. The LLM-generated plans achieved comparable organ-at-risk (OAR) sparing relative to clinical plans while demonstrating improved hot spot control (Dmax: 106.5% vs. 108.8%) and superior conformity (conformity index: 1.18 vs. 1.39 for boost PTV; 1.82 vs. 1.88 for primary PTV). This study demonstrates the feasibility of a zero-shot, LLM-driven workflow for automated IMRT treatment planning in a commercial TPS. The proposed approach provides a generalizable and clinically applicable solution that could reduce planning variability and support broader adoption of AI-based planning strategies.
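The iterative loop described in the abstract (optimize, read the intermediate plan state, propose new constraint values, repeat with access to earlier attempts) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: `ToyTPS` is a hypothetical stand-in for a treatment planning system, and `rule_based_proposer` is a simple placeholder occupying the position where an LLM call would sit. All names and numbers are invented for the example.

```python
from typing import Callable, Dict, List, Tuple

Constraints = Dict[str, float]
Endpoints = Dict[str, float]
History = List[Tuple[Constraints, Endpoints]]


class ToyTPS:
    """Hypothetical TPS stand-in: attained dose follows the constraint
    but cannot drop below a per-structure floor."""

    def __init__(self, floors: Endpoints) -> None:
        self.floors = floors

    def optimize(self, constraints: Constraints) -> Endpoints:
        return {name: max(self.floors[name], value) for name, value in constraints.items()}


def rule_based_proposer(goals: Endpoints, constraints: Constraints,
                        endpoints: Endpoints, history: History) -> Constraints:
    """Placeholder for the LLM step: tighten any constraint whose endpoint
    still exceeds its clinical goal. The history is accepted but unused here."""
    proposed = dict(constraints)
    for name, goal in goals.items():
        if endpoints[name] > goal:
            proposed[name] = 0.9 * endpoints[name]
    return proposed


def run_agent(tps: ToyTPS, goals: Endpoints, initial: Constraints,
              propose: Callable[[Endpoints, Constraints, Endpoints, History], Constraints],
              max_steps: int = 10) -> History:
    constraints = dict(initial)
    history: History = []
    for _ in range(max_steps):
        endpoints = tps.optimize(constraints)            # run one optimization
        history.append((dict(constraints), endpoints))   # record attempt and result
        if all(endpoints[n] <= g for n, g in goals.items()):
            break                                        # all goals met
        constraints = propose(goals, constraints, endpoints, history)
    return history


# Illustrative values only (not from the paper).
goals = {"cord": 40.0, "parotid": 26.0}
tps = ToyTPS(floors={"cord": 30.0, "parotid": 20.0})
history = run_agent(tps, goals, {"cord": 50.0, "parotid": 35.0}, rule_based_proposer)
final_constraints, final_endpoints = history[-1]
```

Passing the full history to the proposer mirrors the abstract's point that decisions are informed by previous optimization attempts as well as the current observation; a real proposer could use it to avoid repeating adjustments that did not help.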

Paper Structure

This paper contains 12 sections, 2 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: LLM-based agentic workflow for automatic inverse planning. The upper panel illustrates the conventional manual planning workflow, where a human planner iteratively reviews intermediate plan states and adjusts dose–volume constraints. The lower panel depicts the proposed LLM-driven agentic workflow, designed to mimic the manual process. Guided by clinical objectives, prior knowledge of optimization systems, and access to computational tools, the LLM leverages its general reasoning capability to analyze plan status and adapt constraints in a human-like manner. At each iteration, structured chain-of-thought reasoning is applied to enhance decision quality and constraint refinement.
  • Figure 2: Distribution of dosimetric endpoints for GPT-4.1-WP–generated plans compared with clinical plans. CI: conformity index; HI: homogeneity index; $D_{50}$: median dose. Asterisks ($\ast$) indicate statistically significant differences between groups ($p < 0.05$).
  • Figure 3: Planning log for the example case. Left panel: trajectories of dose constraints (solid lines) and attained dosimetric endpoints (dashed lines with markers) across optimization steps. Right panels: evolution of DVHs. Step 0 shows the initial plan optimized with PTV constraints only. In subsequent steps, dashed lines indicate the DVHs from the previous step, solid lines represent the updated DVHs, and the shaded regions highlight inter-step DVH changes.
  • Figure 4: Comparison of isodose distributions between the GPT-4.1-WP–generated plan (left) and the clinical reference plan (right). Red segment: boost PTV; orange segment: primary PTV.