Class-Dependent Perturbation Effects in Evaluating Time Series Attributions

Gregor Baer, Isel Grau, Chao Zhang, Pieter Van Gorp

TL;DR

The paper investigates class-dependent effects in perturbation-based evaluation of time series feature attributions and uncovers that perturbation effectiveness can vary across predicted classes. It introduces the class-adjusted degradation score $DS_c$ with a cross-class penalty $\Delta$, defined via $DS_c(\alpha) = \overline{DS} - \alpha \Delta$, to balance aggregate attribution quality with cross-class consistency. Through experiments on four UCR datasets using two architectures and five attribution methods, it demonstrates that high aggregate performance often correlates with pronounced class-specific effects, and that optimal perturbation strategies vary by dataset and model. The work provides practical guidance for evaluation, including class-stratified analyses and the $DS_c$ penalty, and outlines future directions toward class-aware perturbation strategies and adaptive evaluation procedures.
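
The class-adjusted score above can be illustrated with a minimal sketch. The summary does not state how the penalty is computed, so this example assumes that the aggregate term is the mean degradation score over all samples and that Delta is the mean absolute difference between class-wise mean scores over all class pairs. The function name `class_adjusted_ds` and its parameters are hypothetical, and the paper's exact definition may differ.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean


def class_adjusted_ds(ds_values, predicted_classes, alpha=0.5):
    """Return (ds_bar, delta, ds_c) for per-sample degradation scores.

    Assumption (not confirmed by the source): delta is the mean absolute
    difference between class-wise mean DS values over all class pairs.
    """
    # Group per-sample scores by the class the model predicted.
    by_class = defaultdict(list)
    for ds, cls in zip(ds_values, predicted_classes):
        by_class[cls].append(ds)

    # Aggregate term: mean over all samples (assumed).
    ds_bar = mean(ds_values)

    # Penalty term: average gap between class-wise means (assumed).
    class_means = [mean(scores) for scores in by_class.values()]
    gaps = [abs(a - b) for a, b in combinations(class_means, 2)]
    delta = mean(gaps) if gaps else 0.0

    return ds_bar, delta, ds_bar - alpha * delta


# Toy data: class 0 is perturbed effectively, class 1 barely reacts.
ds_bar, delta, ds_c = class_adjusted_ds(
    [0.8, 0.6, 0.2, 0.0], [0, 0, 1, 1], alpha=0.5
)
```

With `alpha = 0` the score reduces to the plain aggregate; larger values of `alpha` penalize configurations whose effectiveness differs strongly between classes.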

Abstract

As machine learning models become increasingly prevalent in time series applications, Explainable Artificial Intelligence (XAI) methods are essential for understanding their predictions. Within XAI, feature attribution methods aim to identify which input features contribute the most to a model's prediction, with their evaluation typically relying on perturbation-based metrics. Through systematic empirical analysis across multiple datasets, model architectures, and perturbation strategies, we reveal previously overlooked class-dependent effects in these metrics: they show varying effectiveness across classes, achieving strong results for some while remaining less sensitive to others. In particular, we find that the most effective perturbation strategies often demonstrate the most pronounced class differences. Our analysis suggests that these effects arise from the learned biases of classifiers, indicating that perturbation-based evaluation may reflect specific model behaviors rather than intrinsic attribution quality. We propose an evaluation framework with a class-aware penalty term to help assess and account for these effects in evaluating feature attributions, offering particular value for class-imbalanced datasets. Although our analysis focuses on time series classification, these class-dependent effects likely extend to other structured data domains where perturbation-based evaluation is common.

Paper Structure

This paper contains 14 sections, 5 equations, 7 figures, 6 tables.

Figures (7)

  • Figure 1: Output from the Gradients (Simonyan et al., 2014) and Gradient SHAP (Lundberg and Lee, 2017) attribution methods for an InceptionTime (Ismail Fawaz et al., 2020) classifier on one FordB dataset sample. The white line represents the input time series, while the heatmap indicates feature importance for the predicted class over time, with lighter colors denoting higher importance. Attributions were normalized to [0,1].
  • Figure 2: Schematic illustration of time series perturbation based on feature attributions. Reddish rectangles represent feature attribution importance (darker red indicates higher importance). This example shows one perturbation approach: replacing values in the most important region with a constant value (e.g. zero).
  • Figure 3: Example of MoRF and LeRF perturbation curves. Here, we show a desirable outcome, where $\text{PC}_{\text{MoRF}}$ drops quickly as a result of perturbation whereas $\text{PC}_{\text{LeRF}}$ stays unaffected when perturbing the first 40% of least important features, resulting in a high DS, or area between both perturbation curves.
  • Figure 4: Distributions of DS for different attribution methods on the FordB dataset (classifier: InceptionTime, perturbation strategy: SubMean).
  • Figure 5: Class-stratified analysis of DS for different perturbation strategies on the FordB dataset (classifier: InceptionTime, attribution method: FO). Thin lines show individual observations, while thick lines indicate class means.
  • ...and 2 more figures
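
The captions for Figures 2 and 3 describe perturbing a series in order of attribution and reading DS as the area between the LeRF and MoRF perturbation curves. The sketch below illustrates that idea under stated assumptions: perturbation replaces individual time steps with a constant (zero), the area is computed with the trapezoidal rule, and `toy_model` is a hypothetical stand-in for a classifier's predicted-class probability. All function names are illustrative, and the paper's normalization may differ.

```python
def perturb(series, attributions, fraction, most_relevant_first=True, fill=0.0):
    """Replace a fraction of time steps with a constant, ordered by attribution."""
    order = sorted(
        range(len(series)),
        key=lambda i: attributions[i],
        reverse=most_relevant_first,
    )
    k = round(fraction * len(series))
    perturbed = list(series)
    for i in order[:k]:
        perturbed[i] = fill
    return perturbed


def perturbation_curve(model, series, attributions, fractions, most_relevant_first):
    """Model output after perturbing each fraction of the input."""
    return [
        model(perturb(series, attributions, f, most_relevant_first))
        for f in fractions
    ]


def degradation_score(fractions, pc_morf, pc_lerf):
    """Trapezoidal area between the LeRF and MoRF curves."""
    gaps = [lerf - morf for lerf, morf in zip(pc_lerf, pc_morf)]
    return sum(
        (fractions[i + 1] - fractions[i]) * (gaps[i] + gaps[i + 1]) / 2
        for i in range(len(fractions) - 1)
    )


def toy_model(x):
    """Hypothetical probability-like score: half the sum of the inputs."""
    return sum(x) / 2


series = [0.0, 0.0, 1.0, 1.0]
attributions = [0.0, 0.1, 0.9, 1.0]  # ranks the two informative steps highest
fractions = [0.0, 0.25, 0.5, 0.75, 1.0]

pc_morf = perturbation_curve(toy_model, series, attributions, fractions, True)
pc_lerf = perturbation_curve(toy_model, series, attributions, fractions, False)
ds = degradation_score(fractions, pc_morf, pc_lerf)
```

Here the attributions rank the informative time steps correctly, so the MoRF curve drops immediately while the LeRF curve stays flat at first, which yields a positive DS.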