Cryo-RL: automating prostate cancer cryoablation planning with reinforcement learning

Trixia Simangan, Ahmed Nadeem Abbasi, Yipeng Hu, Shaheer U. Saeed

TL;DR

Cryo-RL reframes prostate cancer cryoablation planning as a Markov decision process and learns a policy via reinforcement learning, trained in a clinically constrained simulator that models intraoperative variability. The method optimizes cryoprobe placements and ice-sphere diameters using a reward based on ice-tumor overlap while preserving healthy tissue, with PPO guiding policy updates. Evaluation on 583 retrospective PROMIS cases shows that Cryo-RL outperforms automated baselines by more than 8 percentage points in Dice score and matches human expert performance while substantially reducing planning time, demonstrating potential for scalable, standardized cryoablation planning. The authors release their code to support reproducibility and future research in automated interventional planning.

Abstract

Cryoablation is a minimally invasive localised treatment for prostate cancer that destroys malignant tissue during de-freezing, while sparing surrounding healthy structures. Its success depends on accurate preoperative planning of cryoprobe placements to fully cover the tumour and avoid critical anatomy. This planning is currently manual, expertise-dependent, and time-consuming, leading to variability in treatment quality and limited scalability. In this work, we introduce Cryo-RL, a reinforcement learning framework that models cryoablation planning as a Markov decision process and learns an optimal policy for cryoprobe placement. Within a simulated environment that models clinical constraints and stochastic intraoperative variability, an agent sequentially selects cryoprobe positions and ice sphere diameters. Guided by a reward function based on tumour coverage, this agent learns a cryoablation strategy that leads to optimal cryoprobe placements without the need for any manually designed plans. Evaluated on 583 retrospective prostate cancer cases, Cryo-RL improved Dice by over 8 percentage points compared with the best automated baselines, which are based on geometric optimisation, and matched human expert performance while requiring substantially less planning time. These results highlight the potential of reinforcement learning to deliver clinically viable, reproducible, and efficient cryoablation plans.
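
To make the overlap measure concrete, the following is a minimal sketch of how a Dice score between a set of ice spheres and a tumour mask could be computed. It is not the authors' implementation: the reward, simulator, and ice-ball model used by Cryo-RL are not specified in this section. The sketch assumes idealised spheres on a voxel grid with centres and diameters in voxel units, and the names `sphere_mask`, `dice`, and `plan_dice` are hypothetical.

```python
import numpy as np


def sphere_mask(shape, centre, diameter):
    """Boolean voxel mask of an idealised sphere (assumption: voxel units)."""
    zz, yy, xx = np.ogrid[:shape[0], :shape[1], :shape[2]]
    dist_sq = (zz - centre[0]) ** 2 + (yy - centre[1]) ** 2 + (xx - centre[2]) ** 2
    return dist_sq <= (diameter / 2.0) ** 2


def dice(a, b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) between two binary masks."""
    a = a.astype(bool)
    b = b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom


def plan_dice(tumour, probes):
    """Dice between the tumour mask and the union of all ice spheres.

    `probes` is a list of (centre, diameter) pairs, one per cryoprobe
    (hypothetical plan representation, chosen for illustration only).
    """
    ice = np.zeros(tumour.shape, dtype=bool)
    for centre, diameter in probes:
        ice |= sphere_mask(tumour.shape, centre, diameter)
    return dice(ice, tumour)


# Toy example: a spherical "tumour" and two candidate single-probe plans.
tumour = sphere_mask((32, 32, 32), (16, 16, 16), 10)
centred = plan_dice(tumour, [((16, 16, 16), 10)])
shifted = plan_dice(tumour, [((16, 16, 20), 10)])
```

In this toy setting the centred probe reproduces the tumour mask exactly, while the shifted probe overlaps it only partially and scores lower. A planning reward along these lines would favour placements whose combined ice volume covers the tumour without spilling into surrounding tissue.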

Paper Structure

This paper contains 17 sections, 3 equations, 2 figures, and 2 tables.

Figures (2)

  • Figure 1: An overview of the Cryo-RL framework.
  • Figure 2: Cryo-RL ablation plans for the first visit ($t=1$), showing the prostate in green, the lesion in red and the ablation area in blue, with dots indicating the projection of the centre-points of the placed cryoprobes. The top shows 2D slices from the 3D volume overlaid onto the preoperative MR image slice and the bottom shows a 3D view.