Data Quality Monitoring for the Hadron Calorimeters Using Transfer Learning for Anomaly Detection
Mulugeta Weldezgina Asres, Christian Walter Omlin, Long Wang, Pavel Parygin, David Yu, Jay Dittmann, The CMS-HCAL Collaboration
TL;DR
This work addresses data-scarce, high-dimensional spatio-temporal (ST) anomaly detection (AD) in the CMS Hadron Calorimeter (HCAL) by applying transfer learning (TL) to GraphSTAD, a hybrid autoencoder that combines convolutional, graph, and recurrent neural networks (CNN+GNN+RNN). By transferring models from the HCAL Endcap (HE) to the HCAL Barrel (HB) and exploring multiple TL configurations on both the encoder and decoder, the study demonstrates substantial reductions in trainable parameters and improved robustness to training-data contamination while preserving ST reconstruction and AD performance. The results show that careful TL of spatial encoders and temporal decoders, together with state-preserving RNNs and learning-rate scheduling, yields gains in reconstruction accuracy and AD performance when target data are limited. These findings offer practical guidance for deploying TL-enabled ST AD in large-scale detector monitoring and provide insights transferable to other ST data domains with similar structure and constraints.
Abstract
The proliferation of sensors brings an immense volume of spatio-temporal (ST) data in many domains, including monitoring, diagnostics, and prognostics applications. Data curation is a time-consuming process for a large volume of data, making it challenging and expensive to deploy data analytics platforms in new environments. Transfer learning (TL) mechanisms promise to mitigate data sparsity and model complexity by utilizing pre-trained models for a new task. Despite the triumph of TL in fields like computer vision and natural language processing, efforts on complex ST models for anomaly detection (AD) applications are limited. In this study, we present the potential of TL within the context of high-dimensional ST AD with a hybrid autoencoder architecture, incorporating convolutional, graph, and recurrent neural networks. Motivated by the need for improved model accuracy and robustness, particularly in scenarios with limited training data on systems with thousands of sensors, this research investigates the transferability of models trained on different sections of the Hadron Calorimeter of the Compact Muon Solenoid experiment at CERN. The key contributions of the study include exploring TL's potential and limitations within the context of encoder and decoder networks, revealing insights into model initialization and training configurations that enhance performance while substantially reducing trainable parameters and mitigating data contamination effects. Code: https://github.com/muleina/CMS_HCAL_ML_OnlineDQM
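To illustrate how freezing transferred components reduces the number of trainable parameters, the following minimal sketch compares a few TL configurations. The component names, parameter counts, and configuration names are hypothetical placeholders chosen for illustration; they are not the GraphSTAD architecture's actual sizes or the configurations evaluated in the study.

```python
# Minimal sketch: effect of freezing transferred components on trainable parameters.
# ASSUMPTION: all component names and parameter counts below are hypothetical
# placeholders, not the real GraphSTAD layer sizes.
SOURCE_MODEL = {
    "cnn_encoder": 40_000,
    "gnn_encoder": 25_000,
    "rnn_encoder": 60_000,
    "rnn_decoder": 60_000,
    "cnn_decoder": 40_000,
}

# ASSUMPTION: illustrative TL configurations; each maps to the set of
# components whose transferred weights are kept frozen on the target task.
TL_CONFIGS = {
    "train_from_scratch": set(),
    "freeze_spatial_encoders": {"cnn_encoder", "gnn_encoder"},
    "fine_tune_cnn_decoder_only": {
        "cnn_encoder", "gnn_encoder", "rnn_encoder", "rnn_decoder",
    },
}


def trainable_params(model, frozen):
    """Sum the parameters of all components that are not frozen."""
    unknown = set(frozen) - set(model)
    if unknown:
        raise KeyError(f"unknown components: {sorted(unknown)}")
    return sum(count for name, count in model.items() if name not in frozen)


def reduction(model, frozen):
    """Fraction of parameters removed from training by freezing."""
    total = sum(model.values())
    return 1.0 - trainable_params(model, frozen) / total


if __name__ == "__main__":
    for name, frozen in TL_CONFIGS.items():
        n = trainable_params(SOURCE_MODEL, frozen)
        r = reduction(SOURCE_MODEL, frozen)
        print(f"{name}: {n} trainable parameters ({r:.1%} reduction)")
```

In a deep learning framework, the same idea would typically be realized by loading the source model's weights into the target model and disabling gradient updates for the frozen components; the bookkeeping above only shows why such configurations shrink the training problem.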
