Spurious Correlations in Concept Drift: Can Explanatory Interaction Help?

Cristiana Lalletti, Stefano Teso

TL;DR

The paper addresses the problem that spurious correlations (SCs) in confounded data distort the statistics that drift detectors track in online learning. It proposes ebc-exstream, an explanation-based drift detector in the exstream family that uses SHAP explanations and an entropy-based heuristic to decide when to request human-in-the-loop feedback for identifying and deconfounding SCs, with the aim of improving drift detection under confounding. The method extends exstream with an interactive step and a deconfounding procedure that randomizes the identified spurious features during training. Preliminary experiments on artificially confounded data suggest that ebc-exstream improves detection timing relative to exstream and baseline detectors, and that the entropy-based heuristic reduces the annotation burden. The work highlights a critical interaction between SCs and drift detection and lays groundwork for more robust online detectors in non-IID, confounded environments.

Abstract

Long-running machine learning models face the issue of concept drift (CD), whereby the data distribution changes over time, compromising prediction performance. Updating the model requires detecting drift by monitoring the data and/or the model for unexpected changes. We show, however, that spurious correlations (SCs) can spoil the statistics tracked by detection algorithms. Motivated by this, we introduce ebc-exstream, a novel detector that leverages model explanations to identify potential SCs and human feedback to correct for them. It uses an entropy-based heuristic to reduce the amount of feedback required, cutting annotation costs. Our preliminary experiments on artificially confounded data highlight the promise of ebc-exstream for reducing the impact of SCs on detection.
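
The two ingredients named above, an entropy-based trigger for human feedback and deconfounding by randomizing flagged features, can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not say what the entropy is computed over or in which direction the threshold applies. The sketch assumes the entropy is taken over normalized absolute feature attributions (for example SHAP values computed elsewhere) and that a low value, meaning the model leans on very few features, prompts a query. All function names and the threshold are hypothetical.

```python
import math
import random


def attribution_entropy(attributions):
    """Shannon entropy (in nats) of the normalized absolute attributions.

    Assumption: attributions are per-feature importance scores, e.g. SHAP
    values averaged over a window; the source does not specify this.
    """
    weights = [abs(a) for a in attributions]
    total = sum(weights)
    if total == 0:
        return 0.0
    probs = [w / total for w in weights]
    return -sum(p * math.log(p) for p in probs if p > 0)


def should_query_human(attributions, threshold):
    """Hypothetical trigger: ask for feedback when attribution mass is
    concentrated on few features (entropy below the threshold)."""
    return attribution_entropy(attributions) < threshold


def randomize_features(rows, spurious_idx, rng):
    """Return a copy of rows in which each flagged column is shuffled
    across rows. This keeps the column's marginal distribution but breaks
    its association with the labels, one simple way to 'randomize' a
    feature that a human has marked as spurious."""
    out = [list(r) for r in rows]
    for j in spurious_idx:
        column = [r[j] for r in out]
        rng.shuffle(column)
        for r, value in zip(out, column):
            r[j] = value
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    window = [[0.1, 1.0, 5.0], [0.2, 0.0, 6.0], [0.3, 1.0, 7.0]]
    attributions = [0.05, 0.90, 0.05]  # mass concentrated on feature 1
    if should_query_human(attributions, threshold=0.5):
        # Suppose the annotator confirms feature 1 is spurious.
        window = randomize_features(window, spurious_idx=[1], rng=rng)
    print(window)
```

Shuffling within a window is only one possible randomization; resampling the flagged feature from a fixed reference distribution would be another, and the choice affects how much of the original marginal is preserved.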

Paper Structure (9 sections, 1 figure)

Figures (1)

  • Figure 1: Alarms raised by ebc-exstream (right) and exstream (left) on c-stagger (top) and c-electricity (bottom). Time is on the $x$-axis; yellow bars denote ground-truth drift events, and cyan dashed bars denote a heuristic acceptable delay.