Continual Learning via Ensemble-Based Depth-Wise Masked Autoencoders for Data Quality Monitoring in High-Energy Physics
Dale Julson, Eric Reinhardt, Andrii Krutsylo, Resham Sohal, Guillermo Fidalgo, Sergei Gleyzer, Emanuele Usai, The CMS HCAL Collaboration
TL;DR
This work introduces DepthViT, a lightweight masked autoencoder architecture that employs unique depth-wise embeddings and cross-depth attention to perform computationally efficient anomaly detection (AD), and presents a path toward adaptive AD systems capable of sustained operation in dynamic data environments.
Abstract
Machine learning (ML) techniques have been shown to improve the accuracy and efficiency of anomaly detection (AD) compared to conventional methods. This has led to the adoption of ML for data quality monitoring (DQM), where the goal is to verify that a system is operating free of undesirable or potentially deleterious anomalies. In high-energy physics (HEP), detectors must operate for long periods in harsh environments, so ML models for DQM that are trained on static datasets are bound to degrade as distributional shifts naturally occur in the incoming data streams, unless those shifts are directly mitigated with continual learning techniques. This work introduces DepthViT, a lightweight masked autoencoder architecture that employs unique depth-wise embeddings and cross-depth attention to perform computationally efficient AD. A continual learning framework is developed in which DepthViT models trained on the most recent data streams are ensembled with older models, yielding an overall system that is more resilient to shifts in the incoming data. When evaluated on occupancy maps from the Compact Muon Solenoid (CMS) hadron calorimeter across multiple data-taking campaigns, the proposed method maintains precision above 99% and a stable ratio of correct anomaly predictions to the total number of anomalies under both small and large distributional shifts. Beyond HEP, the same ensembling-based continual adaptation strategy can be directly applied to industrial monitoring environments where data also naturally evolve over time. This work therefore presents a path toward adaptive anomaly detection systems capable of sustained operation in dynamic data environments.
