Early Concept Drift Detection via Prediction Uncertainty

Pengqian Lu, Jie Lu, Anjin Liu, Guangquan Zhang

TL;DR

The paper tackles concept drift in streaming data, showing that error-rate-based detectors can miss early distributional changes that leave the error rate unchanged. It introduces the Prediction Uncertainty Index (PU-index), $u_i = 1 - f_{y_i}(x_i)$, and develops PUDD, a drift detector built on an Adaptive PU-index Bucketing scheme and Pearson’s Chi-square test. Two theoretical results (Theorem 1 and Theorem 2) establish that the PU-index is at least as sensitive as the error rate and can reveal drift that the error rate does not. The approach is validated on synthetic and real-world datasets, including CIFAR-10-CD. Empirically, PUDD often outperforms classic and state-of-the-art detectors, with the bucketing strategy contributing notable gains, which suggests the method is practical for early drift monitoring.
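As a minimal sketch of the PU-index formula, the snippet below assumes that $f(x_i)$ is the classifier's vector of predicted class probabilities and that $y_i$ is the true label of instance $i$. The function name `pu_index` and the toy inputs are illustrative and do not come from the paper.

```python
def pu_index(probs, labels):
    """Return u_i = 1 - f_{y_i}(x_i) for every instance.

    probs  : list of per-class probability lists, one per instance (assumed input format)
    labels : list of true class indices, one per instance
    """
    return [1.0 - p[y] for p, y in zip(probs, labels)]


# Toy binary example: three instances, all with true label 0.
probs = [
    [0.9, 0.1],  # confident and correct  -> low uncertainty
    [0.6, 0.4],  # correct but unsure     -> medium uncertainty
    [0.3, 0.7],  # misclassified          -> high uncertainty
]
labels = [0, 0, 0]

u = pu_index(probs, labels)

# The error rate only records whether the arg-max class matches the label.
predictions = [max(range(len(p)), key=p.__getitem__) for p in probs]
error_rate = sum(pred != y for pred, y in zip(predictions, labels)) / len(labels)
```

The first two instances both count as "correct" for the error rate, yet their PU-index values differ, which is the extra information the index carries.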

Abstract

Concept drift, characterized by unpredictable changes in data distribution over time, poses significant challenges to machine learning models in streaming data scenarios. Although error rate-based concept drift detectors are widely used, they often fail to identify drift in the early stages when the data distribution changes but error rates remain constant. This paper introduces the Prediction Uncertainty Index (PU-index), derived from the prediction uncertainty of the classifier, as a superior alternative to the error rate for drift detection. Our theoretical analysis demonstrates that: (1) The PU-index can detect drift even when error rates remain stable. (2) Any change in the error rate will lead to a corresponding change in the PU-index. These properties make the PU-index a more sensitive and robust indicator for drift detection compared to existing methods. We also propose a PU-index-based Drift Detector (PUDD) that employs a novel Adaptive PU-index Bucketing algorithm for detecting drift. Empirical evaluations on both synthetic and real-world datasets demonstrate PUDD's efficacy in detecting drift in structured and image data.
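Property (1) can be illustrated with a small sketch. It is not the paper's PUDD algorithm: it swaps the Adaptive PU-index Bucketing for fixed equal-width bins, assumes a binary problem where an instance is misclassified exactly when its PU-index exceeds 0.5, and uses a hard-coded table of standard chi-square critical values at significance level 0.05. All function names and the two toy windows are hypothetical.

```python
# Standard chi-square critical values at alpha = 0.05, keyed by degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def histogram(u_values, n_bins=4):
    """Count PU-index values in equal-width bins over [0, 1] (fixed bins, an assumption)."""
    counts = [0] * n_bins
    for u in u_values:
        counts[min(int(u * n_bins), n_bins - 1)] += 1
    return counts


def chi_square_homogeneity(h1, h2):
    """Pearson's chi-square statistic and degrees of freedom for two histograms.

    Assumes both windows are non-empty. Bins empty in both windows are dropped.
    """
    pairs = [(a, b) for a, b in zip(h1, h2) if a + b > 0]
    n1 = sum(a for a, _ in pairs)
    n2 = sum(b for _, b in pairs)
    total = n1 + n2
    stat = 0.0
    for a, b in pairs:
        col = a + b
        e1 = col * n1 / total  # expected count in window 1
        e2 = col * n2 / total  # expected count in window 2
        stat += (a - e1) ** 2 / e1 + (b - e2) ** 2 / e2
    return stat, len(pairs) - 1


def error_rate(u_values):
    """Binary case with an arg-max decision: misclassified iff u > 0.5 (assumption)."""
    return sum(u > 0.5 for u in u_values) / len(u_values)


# Two toy windows of PU-index values with the same share of misclassified instances.
window_1 = [0.1] * 40 + [0.4] * 40 + [0.7] * 20
window_2 = [0.1] * 10 + [0.4] * 70 + [0.7] * 20

stat, df = chi_square_homogeneity(histogram(window_1), histogram(window_2))
drift_detected = df >= 1 and stat > CHI2_CRIT_05[df]
same_error_rate = error_rate(window_1) == error_rate(window_2)
```

In this toy setup the correctly classified instances in the second window sit closer to the decision boundary, so the histograms differ although the error rates match. An error-rate-based test would see no change here, while the histogram test does.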

Paper Structure

This paper contains 27 sections, 3 theorems, 36 equations, 8 figures, 6 tables, and 2 algorithms.

Key Result

Theorem 1

Let $W_1$ and $W_2$ be two windows of a data stream in a multi-class classification problem. If their respective PU-index histograms $H_1$ and $H_2$ are identical, where the histograms are constructed such that the first bin contains all misclassified instances and the remaining bins partition the m

Figures (8)

  • Figure 1: Illustrative example of the early stage of concept drift when an error rate-based detector fails to detect concept drift, but a prediction uncertainty-based detector can. The data around decision boundaries have been highlighted in the middle of the figure, showing the distribution gap between test sets 1 and 2. Such a gap implies concept drift occurrence. However, in this case, the error rates of the two test sets are the same. We also provide a theoretical proof in the Appendix to demonstrate the existence of such a case. By contrast, the distribution of prediction uncertainty has changed. The example implies that a prediction uncertainty-based detector can detect drift when an error rate-based detector fails.
  • Figure 2: The framework of our proposed algorithm. The sliding window strategy has two components, i.e., antiquated data discard and cutting point exploration as shown on the left. The Adaptive PU-index Bucketing algorithm is shown in the middle. The drift detection process is shown on the right.
  • Figure 3: Illustrative example of generating CIFAR-10-CD through the Markov process. The classes marked in red represent the user's initial interest and are considered positive labels. All other classes are considered negative labels.
  • Figure 4: Comparison with baselines on CIFAR-10-CD, excluding methods unable to detect drift in image datasets.
  • Figure 5: Accuracy comparison between PUDD (using Adaptive PU-index Bucketing) and Ei-kMeans (EK). We show average accuracy across 9 datasets using 3 classifiers.
  • ...and 3 more figures

Theorems & Definitions (8)

  • Theorem 1
  • Theorem 2
  • Definition 1
  • Definition 2
  • Theorem 3
  • proof
  • proof
  • proof