M$^2$AD: Multi-Sensor Multi-System Anomaly Detection through Global Scoring and Calibrated Thresholding

Sarah Alnegheimish, Zelin He, Matthew Reimherr, Akash Chandrayan, Abhinav Pradhan, Luca D'Angelo

TL;DR

M$^2$AD tackles anomaly detection in heterogeneous multivariate time series across multiple systems by forecasting normal behavior with an LSTM, computing per-sensor residuals, and forming a global anomaly score $S_t$ through a Gaussian Mixture Model and Gamma-calibrated aggregation. The approach offers interpretability by identifying top-contributing sensors and provides theoretical guarantees on error quantification and p-value calibration under dependencies. Empirical results on the MSL and SMAP datasets from NASA and on the SMD dataset show an average improvement of about 21% over baselines, and a real-world Amazon case study with 130 assets demonstrates practical impact, aided by covariates and robust thresholding. The work delivers a scalable, calibrated, and interpretable framework for industrial multi-sensor anomaly detection, with code and results shared publicly.

Abstract

With the widespread availability of sensor data across industrial and operational systems, we frequently encounter heterogeneous time series from multiple systems. Anomaly detection is crucial for such systems to facilitate predictive maintenance. However, most existing anomaly detection methods are designed for either univariate or single-system multivariate data, making them insufficient for these complex scenarios. To address this, we introduce M$^2$AD, a framework for unsupervised anomaly detection in multivariate time series data from multiple systems. M$^2$AD employs deep models to capture expected behavior under normal conditions, using the residuals as indicators of potential anomalies. These residuals are then aggregated into a global anomaly score through a Gaussian Mixture Model and Gamma calibration. We theoretically demonstrate that this framework can effectively address heterogeneity and dependencies across sensors and systems. Empirically, M$^2$AD outperforms existing methods in extensive evaluations by 21% on average, and its effectiveness is demonstrated on a large-scale real-world case study on 130 assets in Amazon Fulfillment Centers. Our code and results are available at https://github.com/sarahmish/M2AD.
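The scoring stage described above (per-sensor residuals, mixture-based p-values, a Gamma-calibrated global score) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the mixture parameters in `MIXTURE`, the Fisher-style combination $-2\sum_j \log p_j$, the moment-matched Gamma fit, and all function names. Under independent, well-calibrated p-values the Fisher statistic over $d$ sensors follows a Gamma distribution with shape $d$ and scale 2, so fitting the Gamma on scores from normal operation is one plausible way to absorb dependence between sensors.

```python
import math
import random
from statistics import NormalDist, fmean, pvariance

# Hypothetical error mixture shared by all sensors: (weights, means, stds).
# A real pipeline would fit one mixture per sensor on residuals from normal data.
MIXTURE = ([0.8, 0.2], [0.0, 1.0], [1.0, 2.0])

def gmm_tail_pvalue(error, mixture=MIXTURE):
    """Upper-tail probability of `error` under a 1-D Gaussian mixture."""
    weights, means, stds = mixture
    p = sum(w * (1.0 - NormalDist(m, s).cdf(error))
            for w, m, s in zip(weights, means, stds))
    return min(max(p, 1e-12), 1.0)  # clip so that log(p) stays finite

def fisher_score(pvalues):
    """Fisher-style combination of per-sensor p-values (assumed form)."""
    return -2.0 * sum(math.log(p) for p in pvalues)

def fit_gamma_by_moments(scores):
    """Moment-matched Gamma (shape, scale) for scores from normal operation."""
    mean, var = fmean(scores), pvariance(scores)
    return mean * mean / var, var / mean

def gamma_sf(x, shape, scale):
    """Gamma survival function via the series for the lower incomplete gamma."""
    if x <= 0:
        return 1.0
    z = x / scale
    term = total = 1.0 / shape
    n = 1
    while term > 1e-16 * total and n < 100_000:
        term *= z / (shape + n)
        total += term
        n += 1
    cdf = math.exp(shape * math.log(z) - z - math.lgamma(shape)) * total
    return min(max(1.0 - cdf, 0.0), 1.0)

def sample_error(rng, mixture=MIXTURE):
    """Draw one synthetic 'normal' error from the mixture."""
    weights, means, stds = mixture
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(means[k], stds[k])

# Calibrate the global score on synthetic normal data for 5 independent sensors.
rng = random.Random(0)
n_sensors = 5
calibration_scores = [
    fisher_score([gmm_tail_pvalue(sample_error(rng)) for _ in range(n_sensors)])
    for _ in range(2000)
]
shape, scale = fit_gamma_by_moments(calibration_scores)

# Score one time step in which sensors 2 and 3 show unusually large errors.
step_errors = [0.1, 0.0, 6.0, 5.5, 0.2]
step_pvalues = [gmm_tail_pvalue(e) for e in step_errors]
global_score = fisher_score(step_pvalues)
global_pvalue = gamma_sf(global_score, shape, scale)

# Sensors with the smallest p-values contribute most to the global score.
top_sensors = sorted(range(n_sensors), key=lambda j: step_pvalues[j])[:2]
print(f"shape={shape:.2f} scale={scale:.2f} score={global_score:.2f}")
print(f"global p-value={global_pvalue:.4g} top sensors={top_sensors}")
```

In this sketch a time step would be flagged when `global_pvalue` falls below a chosen significance level, and `top_sensors` gives a simple ranking of which sensors drove the score. With dependent sensors the fitted shape and scale would drift away from the independent-case values, which is the situation the calibration step is meant to handle.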

Paper Structure

This paper contains 34 sections, 2 theorems, 29 equations, 6 figures, 8 tables.

Key Result

Proposition 1

Given the true model (eq:true_model) and any $\rho>0$ and $\sigma > 0$, we have where $c$ is a universal positive constant.

Figures (6)

  • Figure 1: Overview of the M$^2$AD workflow. The framework trains an LSTM to predict $\mathcal{Y}$, which models the expected pattern of $\mathcal{X}$. Then, M$^2$AD calculates the errors ($\mathbf{e}$) between the observed and expected signal for each sensor. Lastly, we use a Gaussian Mixture Model to find p-values and construct a global anomaly score for a particular asset from all its sensor information.
  • Figure 2: Example of the S-1 signal from the SMAP dataset. (left) The original time series and the predicted one. (center) Error signal using point-wise difference. (right) Error signal using area difference.
  • Figure 3: Performance under different error functions on MSL, SMAP, and SMD datasets.
  • Figure 4: (a) Accuracy of top contributing sensors to the anomaly score compared to a random selection in SMD. (b) Change in model performance without covariates (dark) and with covariates (light).
  • Figure 5: Hierarchical structure of data sources. We collect data from amperage and Monitron which contain sensory information (blue). We also leverage the throughput and availability as covariates (green).
  • ...and 1 more figure
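The two error signals contrasted in Figure 2 can be sketched as follows. The definitions are assumptions made for illustration, since the captions do not give formulas: the point-wise error is taken as the absolute difference at each time step, and the area error as the absolute difference between trapezoidal areas under the observed and predicted signals over a centered window. The function names and the `half_window` parameter are hypothetical.

```python
def pointwise_error(observed, predicted):
    """Absolute difference between observed and predicted values per time step."""
    return [abs(o - p) for o, p in zip(observed, predicted)]

def area_error(observed, predicted, half_window=1):
    """Absolute difference of trapezoidal areas over a centered sliding window."""
    def trapezoid(values):
        # Unit spacing between samples is assumed.
        return sum((a + b) / 2.0 for a, b in zip(values, values[1:]))

    n = len(observed)
    errors = []
    for t in range(n):
        lo, hi = max(0, t - half_window), min(n, t + half_window + 1)
        errors.append(abs(trapezoid(observed[lo:hi]) - trapezoid(predicted[lo:hi])))
    return errors

# A single spike in the observed signal against a flat prediction.
observed = [0.0, 0.0, 0.0, 5.0, 0.0, 0.0, 0.0]
predicted = [0.0] * len(observed)

pointwise = pointwise_error(observed, predicted)
area = area_error(observed, predicted, half_window=1)
print(pointwise)
print(area)
```

Under these assumed definitions, the point-wise error reacts only at the spike itself, while the area error spreads the deviation over neighboring time steps, which makes it smoother and more sensitive to sustained offsets than to isolated samples.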

Theorems & Definitions (2)

  • Proposition 1
  • Proposition 2