AutoML for Multi-Class Anomaly Compensation of Sensor Drift

Melanie Schaller, Mathis Kruse, Antonio Ortega, Marius Lindauer, Bodo Rosenhahn

TL;DR

Sensor drift degrades ML performance over time, and conventional cross-validation can misrepresent this effect due to temporal leakage. The authors propose a two-pronged solution: a novel sensor drift compensation training paradigm and AutoML-DC, which combines anomaly-detection-inspired training with incremental batch learning and automates model/feature/hyperparameter selection using AutoML. They formalize the drift problem across chronological batches and optimize a robust ensemble via CASH (as implemented in auto-sklearn), reporting superior F1 and AUC-ROC on Vergara’s real-world drift dataset and demonstrating strong online-adaptation capabilities. The results show substantial performance gains, underscoring the practical impact for industrial sensor systems, while noting limitations such as dataset generalizability and the need for additional real-world data and unsupervised extensions for broader applicability.

Abstract

Addressing sensor drift is essential in industrial measurement systems, where precise data output is necessary for accurate and reliable process monitoring, because drift progressively degrades the performance of machine learning models over time. Our findings indicate that the standard cross-validation method used in existing model training overestimates performance by inadequately accounting for drift. This is primarily because typical cross-validation techniques allow data instances to appear in both training and testing sets, thereby distorting the accuracy of the predictive evaluation. As a result, these models are unable to precisely predict future drift effects, compromising their ability to generalize and adapt to evolving data conditions. This paper presents two solutions: (1) a novel sensor drift compensation learning paradigm for validating models, and (2) automated machine learning (AutoML) techniques to enhance classification performance and compensate for sensor drift. By employing strategies such as data balancing, meta-learning, automated ensemble learning, hyperparameter optimization, feature selection, and boosting, our AutoML-DC (Drift Compensation) model significantly improves classification performance against sensor drift. AutoML-DC further adapts effectively to varying drift severities.
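The leakage argument in the abstract can be illustrated with a small sketch. It compares a shuffled k-fold split with a chronological split that trains only on earlier batches and tests on the next one. This is an illustrative assumption about how a drift-aware split might look, not the authors' exact protocol; the helper names (`chronological_splits`, `shuffled_kfold_splits`, `leaks_future`) and the toy batch layout are hypothetical.

```python
import random

def chronological_splits(batch_ids):
    """Yield (train_idx, test_idx): train on all earlier batches, test on the next."""
    batches = sorted(set(batch_ids))
    for b in batches[1:]:
        train = [i for i, x in enumerate(batch_ids) if x < b]
        test = [i for i, x in enumerate(batch_ids) if x == b]
        yield train, test

def shuffled_kfold_splits(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) for a plain shuffled k-fold split."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f_i, f in enumerate(folds) if f_i != i for j in f]
        yield train, test

def leaks_future(train, test, batch_ids):
    """True if any training sample is from a batch at or after the earliest test batch."""
    earliest_test = min(batch_ids[i] for i in test)
    return any(batch_ids[i] >= earliest_test for i in train)

# Toy data (assumed): 4 chronological batches with 5 samples each.
batch_ids = [b for b in range(1, 5) for _ in range(5)]

chrono = list(chronological_splits(batch_ids))
shuffled = list(shuffled_kfold_splits(len(batch_ids), k=4))

chrono_leaks = any(leaks_future(tr, te, batch_ids) for tr, te in chrono)
shuffled_leaks = any(leaks_future(tr, te, batch_ids) for tr, te in shuffled)
print("chronological split leaks:", chrono_leaks)
print("shuffled k-fold leaks:", shuffled_leaks)
```

In the chronological variant, every training sample predates the test batch by construction, so the score reflects performance on drift the model has not yet seen. In the shuffled variant, samples from the same or later drift stages end up on both sides of the split, which is the effect the abstract identifies as the source of overestimated performance.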

Paper Structure

This paper contains 31 sections, 1 equation, 9 figures, 6 tables.

Figures (9)

  • Figure 1: Traditional training setups are inadequate for learning and compensating sensor drift (left). Our novel training paradigm, encompassing two new setups (marked with 1+2, middle), enables the network to learn drift dynamics during training. By integrating adapted AutoML techniques (right), including feature and model selection as well as hyperparameter optimization, early-stopping, and meta-learning, we prevent overfitting. This approach achieves a new state-of-the-art AutoML-DC model for sensor drift compensation.
  • Figure 2: Visualization of the decision boundary shift due to sensor drift and the associated incorrect prediction (left) and the usage of AutoML techniques for drift compensation (right).
  • Figure 3: Visualization of the two-fold training paradigm (right) vs. the traditional training paradigm (left).
  • Figure 4: ROC curves for all evaluated methods; best viewed in color and zoomed in.
  • Figure 5: Decision boundaries of Random Forest versus Support Vector Machine with RBF kernel; best viewed in color and zoomed in.
  • ...and 4 more figures