A Two-Stage Machine Learning-Aided Approach for Quench Identification at the European XFEL
Lynda Boukela, Annika Eichler, Julien Branlard, Nur Zulaiha Jomhari
TL;DR
This work tackles quench identification at the European XFEL with a hybrid two-stage framework. The first stage detects faults using a nonlinear parity-space residual evaluated with the generalized likelihood ratio (GLR). The second stage isolates quenches from other faults with two k-medoids clustering models, one based on Euclidean similarity (EUC) and one on dynamic time warping (DTW). The residual $r(t)$ is generated from a physical model of the superconducting radio-frequency cavity (SRFC) that includes the detuning $\Delta\omega(t)$ and the half-bandwidth $\omega_{1/2}$, and the GLR statistic of that residual triggers the fault alarm. Both clustering models are trained to recognize quench patterns: EUC uses ellipsoidal decision boundaries and DTW uses a cubic boundary to separate quenches from other faults. On 671 evaluation events, EUC and DTW each achieve a ROC-AUC of 0.94, compared with 0.86 for the currently deployed quench detection system (QDS). Both models reach high true-positive rates with a manageable number of false positives, which indicates practical potential for online fault management and reduced downtime at the European XFEL.
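To make the detection stage concrete, the following is a minimal sketch of a sliding-window GLR test for an unknown mean shift in a residual, assuming the fault-free residual is zero-mean Gaussian with known standard deviation. It does not reproduce the nonlinear parity-space residual generator of the paper; the function names `glr_mean_shift` and `detect_fault`, the window length, `sigma`, and the threshold are all illustrative assumptions.

```python
import numpy as np


def glr_mean_shift(residual, window, sigma):
    """Sliding-window GLR statistic for an unknown mean shift.

    Assumes (hypothetically) a zero-mean Gaussian residual with known
    standard deviation `sigma` in the fault-free case. For a window of
    N samples, maximizing the log-likelihood ratio over the unknown mean
    gives g = (sum of residuals)^2 / (2 * N * sigma^2).
    """
    r = np.asarray(residual, dtype=float)
    # Prefix sums give every window sum in O(len(r)).
    csum = np.concatenate(([0.0], np.cumsum(r)))
    window_sums = csum[window:] - csum[:-window]
    return window_sums**2 / (2.0 * window * sigma**2)


def detect_fault(residual, window=20, sigma=1.0, threshold=10.0):
    """Return the first sample index at which the GLR exceeds the threshold.

    The index refers to the last sample of the triggering window.
    Returns None if no alarm is raised.
    """
    g = glr_mean_shift(residual, window, sigma)
    above = np.flatnonzero(g > threshold)
    return int(above[0]) + window - 1 if above.size else None


if __name__ == "__main__":
    # Synthetic residual: fault-free noise, then a mean shift at sample 100.
    rng = np.random.default_rng(0)
    r = rng.normal(0.0, 1.0, size=200)
    r[100:] += 2.0
    print("alarm at sample:", detect_fault(r, window=20, sigma=1.0, threshold=10.0))
```

In such a scheme the threshold trades detection delay against false alarms, and a longer window lowers the noise sensitivity at the cost of a slower response.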
Abstract
This paper introduces a machine learning-aided fault detection and isolation method applied to the case study of quench identification at the European X-Ray Free-Electron Laser. The plant uses 800 superconducting radio-frequency cavities to accelerate electron bunches to energies of up to 17.5 GeV. Various faulty events can disrupt the nominal functioning of the accelerator, including quenches, which can cause the cavities to lose their superconductivity and interrupt their operation. In this context, our solution analyzes signals reflecting the dynamics of the cavities in a two-stage approach. (I) Fault detection, which uses analytical redundancy to process the data and generate a residual; evaluating the residual with the generalized likelihood ratio allows faulty behaviors to be detected. (II) Fault isolation, which distinguishes quenches from the other faults. To this end, we use a data-driven model based on the k-medoids algorithm and explore different similarity measures, namely the Euclidean distance and dynamic time warping. Finally, we evaluate the new method and compare it to the currently deployed quench detection system; the results show that our method achieves improved performance.
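As an illustration of the isolation stage, the sketch below clusters short time series with k-medoids on a precomputed distance matrix, using either the Euclidean distance or dynamic time warping. It is not the authors' implementation: the helper names (`dtw_distance`, `euclidean_distance`, `pairwise`, `k_medoids`), the absolute-difference local cost, the farthest-first initialization, and the alternating update rule are assumptions chosen to keep the example small.

```python
import numpy as np


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length series."""
    return float(np.linalg.norm(np.asarray(a, float) - np.asarray(b, float)))


def dtw_distance(a, b):
    """Classic dynamic time warping with absolute-difference local cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])


def pairwise(series, dist):
    """Symmetric matrix of pairwise distances under the metric `dist`."""
    n = len(series)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = dist(series[i], series[j])
    return D


def k_medoids(D, k, n_iter=100):
    """Alternating k-medoids on a precomputed distance matrix D.

    Initialization (an assumption of this sketch): start from the most
    central point, then repeatedly add the point farthest from the
    current medoids.
    """
    medoids = [int(np.argmin(D.sum(axis=1)))]
    while len(medoids) < k:
        medoids.append(int(np.argmax(D[:, medoids].min(axis=1))))
    medoids = np.array(medoids)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:
                # The new medoid minimizes the total distance to its cluster.
                within = D[np.ix_(members, members)].sum(axis=1)
                new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(D[:, medoids], axis=1)
    return medoids, labels


if __name__ == "__main__":
    # Two time-shifted steps: DTW absorbs the shift, Euclidean does not.
    a = [0.0, 0.0, 1.0, 1.0, 1.0]
    b = [0.0, 1.0, 1.0, 1.0, 1.0]
    print("Euclidean:", euclidean_distance(a, b), "DTW:", dtw_distance(a, b))
```

The choice of similarity measure matters for signals whose characteristic pattern can occur at different times within the recorded window: DTW tolerates such temporal shifts, whereas the Euclidean distance compares samples index by index and is cheaper to compute.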
