Continual Learning via Ensemble-Based Depth-Wise Masked Autoencoders for Data Quality Monitoring in High-Energy Physics

Dale Julson, Eric Reinhardt, Andrii Krutsylo, Resham Sohal, Guillermo Fidalgo, Sergei Gleyzer, Emanuele Usai, The CMS HCAL Collaboration

TL;DR

This work introduces DepthViT, a lightweight masked autoencoder architecture that uses depth-wise embeddings and cross-depth attention to perform computationally efficient anomaly detection (AD), and presents a path toward adaptive AD systems capable of sustained operation in dynamic data environments.

Abstract

Machine learning (ML) techniques have been shown to improve the accuracy and efficiency of anomaly detection (AD) compared to conventional methods. This has led to the adoption of ML for data quality monitoring (DQM), where the operation of a system is monitored to ensure that it is free of undesirable or potentially deleterious anomalies. In high-energy physics (HEP), where detectors must operate for long periods in harsh environments, ML models for DQM that have been trained on static datasets are bound to degrade in performance because of distributional shifts that naturally occur in the incoming data streams, unless these shifts are directly mitigated with continual ML techniques. This work introduces DepthViT, a lightweight masked autoencoder architecture that uses depth-wise embeddings and cross-depth attention to perform computationally efficient AD. A continual learning framework is developed in which DepthViT models trained on the most recent data streams are ensembled with older models, creating an overall system that is more resilient to shifts in incoming data streams. When evaluated on occupancy maps from the Compact Muon Solenoid (CMS) hadron calorimeter across multiple data-taking campaigns, the proposed method maintains precision above 99% and a stable ratio of correct anomaly predictions to the number of anomalies under both small and large distributional shifts. Beyond HEP, the same ensembling-based continual adaptation strategy can be directly applied to industrial monitoring environments where data also evolve naturally over time. This work therefore presents a path toward adaptive anomaly detection systems capable of sustained operation in dynamic data environments.
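
The ensembling idea and the two reported metrics can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how ensemble members are combined, so averaging the per-model reconstruction errors is an assumption, as are the fixed threshold, the toy 4x4 maps, and the two stand-in "models" (constant-map callables in place of trained DepthViT autoencoders). The second metric, the ratio of correct anomaly predictions to the number of anomalies, is computed here as recall.

```python
import numpy as np

def reconstruction_error(x, x_hat):
    """Mean squared reconstruction error per sample, averaged over all map cells."""
    return ((x - x_hat) ** 2).reshape(len(x), -1).mean(axis=1)

def ensemble_scores(x, models):
    """Anomaly score per sample: the mean of the per-model reconstruction errors.

    Averaging is an assumed combination rule, chosen only for illustration.
    """
    return np.mean([reconstruction_error(x, m(x)) for m in models], axis=0)

def precision_and_recall(pred, truth):
    """Precision = TP / predicted anomalies; recall = TP / true anomalies."""
    tp = np.sum(pred & truth)
    return tp / max(np.sum(pred), 1), tp / max(np.sum(truth), 1)

# Hypothetical stand-ins for an older and a newer model: each "reconstructs"
# every input as the typical occupancy level of the era it was trained on.
model_old = lambda x: np.full_like(x, 1.0)
model_new = lambda x: np.full_like(x, 2.0)

# Toy data from the newer era: three normal maps and two with a dead region.
normal = np.full((3, 4, 4), 2.0)
anomalous = np.full((2, 4, 4), 2.0)
anomalous[:, :2, :] = 0.0  # half of each map reads zero occupancy
x = np.concatenate([normal, anomalous])
truth = np.array([False, False, False, True, True])

scores = ensemble_scores(x, [model_old, model_new])
pred = scores > 1.0  # assumed threshold
precision, recall = precision_and_recall(pred, truth)
print(scores, precision, recall)
```

In this toy setup the older model alone assigns the normal and anomalous maps the same error, and it is the newer member that separates them, which is the kind of behavior the ensemble is meant to retain as new models are added.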

Paper Structure

This paper contains 16 sections, 16 equations, 9 figures, 6 tables.

Figures (9)

  • Figure 1: (a) The CMS detector is composed of four sub-detectors in addition to a 3.8 T superconducting solenoid magnet. Each sub-detector specializes in detecting different particle species originating from collision interactions. (b) The HCAL sub-detector consists of multiple regions, including the HE and HB regions. The depth, $i\eta$, and $i\phi$ values of these regions are shown, where $i\eta$ and $i\phi$ are discrete indices of pseudorapidity and azimuthal angle [CMSDetector]. The HCAL is symmetric in $\phi$.
  • Figure 2: Mean DigiOccupancy versus lumisection (LS). The red vertical line separates data from 2018 (left) from data from 2022 (right). The data are temporally ordered both within and between runs; however, it should not be assumed that run data are recorded at equally spaced intervals of time.
  • Figure 3: (a) DigiOccupancy map corresponding to Run325117, LS 0 (2018). Note the large region of missing DigiOccupancy values in the upper left-hand corner, which resulted from a noted detector failure. (b) DigiOccupancy map corresponding to Run355456, LS 0 (2022).
  • Figure 4: DepthViT variational autoencoder architecture, which uses depth-wise convolutional embeddings, depth-wise attention, and latent-space resampling of the encoder outputs.
  • Figure 5: (a) Traditional convolution procedure used within vision transformer architectures, where $f$ represents the shared kernel filters. (b) Convolution procedure used within the DepthViT architecture, where $k$ and $k'$ represent the filters derived separately for each channel. (c) Traditional attention mechanism used within vision transformer architectures, where attention is computed along the input sequence. (d) Depth-wise attention mechanism used within the DepthViT architecture, where attention is instead computed across depth.
  • ...and 4 more figures
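
The contrast drawn in Figure 5 (c) and (d) can be sketched in a few lines of NumPy. This is one reading of the caption, not the DepthViT code: the tensor layout (depth, patches, embedding dimension), the toy sizes, and the use of identity query/key/value projections in place of learned ones are all assumptions made to keep the example short. The only point shown is which axis the attention weights are normalized over.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Single-head scaled dot-product self-attention over the second-to-last axis.

    x has shape (..., n, d). Queries, keys, and values are all x itself
    (identity projections, an assumption standing in for learned weights).
    """
    d = x.shape[-1]
    weights = softmax(x @ np.swapaxes(x, -1, -2) / np.sqrt(d))
    return weights @ x, weights

rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 6, 8))  # assumed layout: (depth, patches, embed_dim)

# (c) Sequence attention: within each depth, every patch attends to the 6 patches.
seq_out, seq_w = self_attention(tokens)       # weights: (4, 6, 6)

# (d) Depth-wise attention: at each patch position, every depth attends to the 4 depths.
depth_first = np.swapaxes(tokens, 0, 1)       # (patches, depth, embed_dim)
dep_out, dep_w = self_attention(depth_first)  # weights: (6, 4, 4)
dep_out = np.swapaxes(dep_out, 0, 1)          # back to (depth, patches, embed_dim)

print(seq_w.shape, dep_w.shape, dep_out.shape)
```

Under these assumptions the depth-wise weight matrices are 4x4 rather than 6x6, so the cost of the attention step scales with the number of depths instead of the sequence length.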