First Experimental Demonstration of Machine Learning-Based Tuning on the PSI Injector 2 Cyclotron

M. Haj Tahar, W. Joho, E. Solodko, M. Bocchio, S. Marquie, M. Busch, A. Barchetti, J. Grillenberger, J. Snuverink, M. Schneider

TL;DR

This work reports the first experimental deployment of reinforcement learning to autonomously tune a high-power cyclotron (Injector 2 at PSI), achieving fast convergence and drift compensation across multiple turns and configurations. By integrating a TD3 agent with physics-informed reward shaping and an overshoot-based acceleration of magnetic settling, the study shows reliable autonomous operation within safety constraints and with only minor interlocks, including overnight stability and high-current scalability up to 800 µA. The results provide evidence for turn- and current-dependent dynamics that require localized policy adaptation; surrogate pretraining helps in matched regimes but can bias adaptation in degraded configurations. Overall, the approach advances ML-assisted control for ADS-class accelerators and outlines a concrete path for scaling toward megawatt-class drivers and robust fault-tolerant operation.

Abstract

Reliable operation of high-power proton cyclotrons is a critical requirement for Accelerator Driven Systems (ADS) and other large-scale applications. Beam tuning in such machines is traditionally performed manually, a process that can be slow, non-optimal, and difficult to execute in the presence of faults or changing conditions. To address this, we developed and deployed a machine learning (ML) based tuning framework on the Injector 2 cyclotron at PSI, chosen as an ideal testbed for high-power operation. The system combined a tailored reinforcement learning algorithm with real-time diagnostics and control, and incorporated accelerator-physics inspired adaptations such as an overshoot strategy that reduced magnetic field settling times by nearly a factor of six. Over an extensive 12-day operational test campaign, relatively long in the context of real-time ML experiments, the ML agent successfully tuned the machine across multiple operating points, achieving convergence within hours and maintaining stable beam extraction with reduced losses. Beyond initial tuning, the system was also operated in evaluation mode overnight, where it autonomously monitored and corrected the machine to compensate for drifts, demonstrating robustness and long-term stability. Crucially, the learned policy generalized reliably from low-current training to higher-current operation, underscoring its scalability. These results constitute the first demonstration of ML-assisted tuning on a high-power cyclotron, with direct relevance to ADS-class drivers.
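The overshoot idea mentioned in the abstract can be illustrated with a toy model. The sketch below assumes the field response behaves like a simple first-order lag, dy/dt = (u - y)/tau, which is a simplification and not necessarily the model used in the paper. The time constant `TAU`, the overshoot factor `GAIN`, and all function names are hypothetical values chosen only for illustration; they are not taken from the experiment.

```python
import math

def settle_time_plain(tau, tol=0.02):
    """Time for a first-order lag to come within `tol` (fractional) of a step target."""
    return tau * math.log(1.0 / tol)

def overshoot_duration(tau, gain):
    """Hold time at gain * target after which the response has just reached target."""
    if gain <= 1.0:
        raise ValueError("gain must exceed 1 for an overshoot")
    return -tau * math.log(1.0 - 1.0 / gain)

def simulate(tau, target, gain, hold, t_end, dt=0.01):
    """Euler-integrate dy/dt = (u - y) / tau with an overshoot command profile."""
    y, t, trace = 0.0, 0.0, []
    while t < t_end:
        # Command is held above the target for `hold` seconds, then set to the target.
        u = gain * target if t < hold else target
        y += dt * (u - y) / tau
        t += dt
        trace.append((t, y))
    return trace

TAU = 15.0     # hypothetical time constant in seconds (assumption)
TARGET = 1.0   # normalized step size
GAIN = 2.0     # hypothetical overshoot factor (assumption)

plain = settle_time_plain(TAU)
hold = overshoot_duration(TAU, GAIN)
trace = simulate(TAU, TARGET, GAIN, hold, t_end=60.0)
final = trace[-1][1]

print(f"plain step settles (2%) after {plain:.1f} s")
print(f"overshoot command held for    {hold:.1f} s")
print(f"response at end of simulation {final:.4f}")
```

In this idealized model the command is driven past the target and released at the moment the response reaches it, so the slow exponential tail is skipped. A real magnet with eddy currents or several time constants would need an empirically tuned overshoot profile.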

Paper Structure

This paper contains 28 sections, 4 equations, 16 figures, 4 tables.

Figures (16)

  • Figure 1: Injector 2 Cyclotron experiment setup with key diagnostics.
  • Figure 2: Phase breakdown of the Injector 2 cyclotron experiment; the experiment ran from April 28, 2025 until May 8, 2025, i.e., a total of 11 days.
  • Figure 3: (a) Average radial beam size at extraction versus current, for various turn numbers (averaged over the last three turns). (b) Comparison of the measured average $\sigma$, obtained by averaging over all turn numbers shown in (a) (mean $\pm 2\sigma$), with space-charge tracking simulations.
  • Figure 4: Sensitivity of phases to AIHS change with respect to the turn number, for a perturbation amplitude of 0.02 A.
  • Figure 5: MIF8 phase response following a +0.05 A step increase in AIHS current without overshooting. The system exhibits a slow exponential settling, requiring nearly 60 seconds to stabilize (turn 72).
  • ...and 11 more figures