First Experimental Demonstration of Machine Learning-Based Tuning on the PSI Injector 2 Cyclotron
M. Haj Tahar, W. Joho, E. Solodko, M. Bocchio, S. Marquie, M. Busch, A. Barchetti, J. Grillenberger, J. Snuverink, M. Schneider
TL;DR
This work demonstrates the first experimental deployment of reinforcement learning to autonomously tune a high-power cyclotron (Injector 2 at PSI), achieving fast convergence and drift compensation across multiple turns and configurations. By integrating a TD3 agent with physics-informed reward shaping and an overshoot strategy that accelerates magnetic settling, the study shows reliable autonomous operation within safety constraints and with only minor interlocks, including overnight stability and scalability to beam currents of up to 800 µA. The results provide evidence of turn- and current-dependent dynamics that require localized policy adaptation; surrogate pretraining helps in matched regimes but can bias adaptation in degraded configurations. Overall, the approach advances ML-assisted control for ADS-class accelerators and outlines a concrete path toward scaling to megawatt-class drivers and robust, fault-tolerant operation.
Abstract
Reliable operation of high-power proton cyclotrons is a critical requirement for Accelerator Driven Systems (ADS) and other large-scale applications. Beam tuning in such machines is traditionally performed manually, a process that can be slow, suboptimal, and difficult to carry out in the presence of faults or changing conditions. To address this, we developed and deployed a machine learning (ML)-based tuning framework on the Injector 2 cyclotron at PSI, chosen as an ideal testbed for high-power operation. The system combined a tailored reinforcement learning algorithm with real-time diagnostics and control, and incorporated accelerator-physics-inspired adaptations such as an overshoot strategy that reduced magnetic field settling times by nearly a factor of six. Over a 12-day operational test campaign, long by the standards of real-time ML experiments, the ML agent successfully tuned the machine across multiple operating points, achieving convergence within hours and maintaining stable beam extraction with reduced losses. Beyond initial tuning, the system was also operated overnight in evaluation mode, where it autonomously monitored and corrected the machine to compensate for drifts, demonstrating robustness and long-term stability. Crucially, the policy learned at low current generalized reliably to higher-current operation, underscoring its scalability. These results constitute the first demonstration of ML-assisted tuning on a high-power cyclotron, with direct relevance to ADS-class drivers.
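
The overshoot idea mentioned above can be illustrated with a toy model. The sketch below assumes, purely for illustration, that the magnetic field follows the setpoint as a single first-order lag with time constant `tau`; the abstract does not specify the real magnet dynamics, and the function names, the `gain` parameter, and all numerical values are hypothetical. In this model, commanding `gain * target` for a suitable hold time and then stepping back to `target` brings the field onto the target much sooner than a plain step.

```python
import math


def settle_time_plain(tau, tol):
    """Time for a first-order lag to come within `tol` (fraction of the step) of its target."""
    return tau * math.log(1.0 / tol)


def overshoot_hold_time(tau, gain):
    """Hold time at `gain * target` after which the field equals `target` (toy model)."""
    if gain <= 1.0:
        raise ValueError("gain must exceed 1 for an overshoot")
    return tau * math.log(gain / (gain - 1.0))


def simulate(tau, target, gain, hold, t_end, dt=1e-3):
    """Euler integration of dB/dt = (setpoint - B) / tau with an initial overshoot phase."""
    b, t = 0.0, 0.0
    while t < t_end:
        # Overshoot the setpoint during the hold phase, then return to the target.
        setpoint = gain * target if t < hold else target
        b += dt * (setpoint - b) / tau
        t += dt
    return b


# Hypothetical values: 10 s time constant, 1 % tolerance, 50 % overshoot.
tau, tol, gain = 10.0, 0.01, 1.5
plain = settle_time_plain(tau, tol)
hold = overshoot_hold_time(tau, gain)
field_at_hold = simulate(tau, 1.0, gain, hold, t_end=hold)
field_later = simulate(tau, 1.0, gain, hold, t_end=hold + 5.0)
speedup = plain / hold
```

With these illustrative values the toy model yields a speedup of roughly a factor of four. That number is a property of the assumed single-pole response and the chosen `gain` and `tol`, and should not be read as a reproduction of the factor of nearly six reported for the real machine.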
