
Robust Explainer Recommendation for Time Series Classification

Thu Trang Nguyen, Thach Le Nguyen, Georgiana Ifrim

TL;DR

This paper addresses the problem of robustly evaluating and ranking saliency-based explanations for time series classification (TSC). It introduces AMEE, a Model-Agnostic Explanation Evaluation framework that uses explanation-guided perturbations and a committee of referee classifiers to measure how informative an explanation is. By aggregating across multiple perturbation strategies and diverse classifiers, AMEE provides a standardized Explanation Power metric that aligns with ground-truth saliency both in synthetic datasets and expert-labeled real data. The work demonstrates that SHAP-based, perturbation-driven explanations often outperform gradient-based ones under similar base-model performance, and it emphasizes practical guidelines for applying AMEE in diverse time series tasks and beyond.

Abstract

Time series classification is a task which deals with temporal sequences, a prevalent data type in domains such as human activity recognition, sports analytics and general sensing. In this area, interest in explainability has been growing, as explanation is key to understanding the data and the model better. Recently, a great variety of techniques have been proposed and adapted for time series to provide explanations in the form of saliency maps, where the importance of each data point in the time series is quantified with a numerical value. However, the saliency maps can, and often do, disagree, so it is unclear which one to use. This paper provides a novel framework to quantitatively evaluate and rank explanation methods for time series classification. We show how to robustly evaluate the informativeness of a given explanation method (i.e., relevance for the classification task), and how to compare explanations side-by-side. The goal is to recommend the best explainer for a given time series classification dataset. We propose AMEE, a Model-Agnostic Explanation Evaluation framework, for recommending saliency-based explanations for time series classification. In this approach, data perturbation is added to the input time series guided by each explanation. Our results show that perturbing discriminative parts of the time series leads to significant changes in classification accuracy, which can be used to evaluate each explanation. To be robust to different types of perturbations and different types of classifiers, we aggregate the accuracy loss across perturbations and classifiers. This novel approach allows us to recommend the best explainer among a set of different explainers, including random and oracle explainers. We provide a quantitative and qualitative analysis for synthetic datasets, a variety of time series datasets, as well as a real-world case study with known expert ground truth.
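
The aggregation step described in the abstract can be illustrated with a minimal sketch. It assumes the accuracy of each referee classifier has already been measured before and after explanation-guided perturbation, and it scores each explainer by its plain mean accuracy loss. The function name `rank_explainers` and the dictionary layout are hypothetical, and the averaging is a simplification for illustration, not necessarily the exact Explanation Power computation used by AMEE.

```python
from statistics import mean


def rank_explainers(base_accuracy, perturbed_accuracy):
    """Rank explainers by mean accuracy loss (larger loss = more informative).

    base_accuracy: {referee: accuracy on the unperturbed test set}
    perturbed_accuracy: {explainer: {(perturbation, referee): accuracy
        after perturbing the regions that explainer marks as salient}}

    Assumption: a simple unweighted mean over all (perturbation, referee)
    pairs; this is an illustrative stand-in, not the paper's exact metric.
    """
    scores = {}
    for explainer, results in perturbed_accuracy.items():
        losses = [
            base_accuracy[referee] - accuracy
            for (_perturbation, referee), accuracy in results.items()
        ]
        scores[explainer] = mean(losses)
    # Highest mean accuracy loss first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up numbers for illustration only; they are not results from the paper.
    base = {"referee_a": 0.90, "referee_b": 0.80}
    perturbed = {
        "random": {("mean", "referee_a"): 0.85, ("mean", "referee_b"): 0.78},
        "oracle": {("mean", "referee_a"): 0.50, ("mean", "referee_b"): 0.40},
    }
    for name, loss in rank_explainers(base, perturbed):
        print(f"{name}: mean accuracy loss {loss:.3f}")
```

Under this sketch, an explainer that highlights truly discriminative regions should sit near the oracle at the top of the ranking, while an uninformative one should sit near the random explainer.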

Paper Structure

This paper contains 44 sections, 3 equations, 21 figures, 11 tables, 2 algorithms.

Figures (21)

  • Figure 1: A saliency map explanation is a vector of feature importance weights overlaid on the original time series, where each point in the time series is colored according to its importance. The saliency is obtained by classifying a motion time series using different classifiers and explainers. The most discriminative parts according to the explanation method are colored in deep red, and the non-discriminative parts are colored in deep blue.
  • Figure 2: Saliency map from two explanation methods on two examples from the Coffee dataset: the bottom row is an explanation from MrSEQL Classifier (intrinsic explanation); the top row is an explanation from SHAP, a post-hoc explanation method based on MrSEQL Classifier.
  • Figure 3: Saliency map from two explanation methods on two examples from the GunPoint dataset: the bottom row is an explanation from MrSEQL Classifier (intrinsic explanation); the top row is an explanation from SHAP, a post-hoc explanation method based on MrSEQL Classifier.
  • Figure 4: Time series data perturbation strategy: an example time series with a known saliency map (left) is perturbed on its most discriminative region with either the mean or Gaussian noise, computed from local time steps (local) or from global time steps across the entire dataset (global). In this example, the top 20% of values with the highest saliency weights are perturbed.
  • Figure 5: The AMEE evaluation framework requires 3 elements: (a) a dataset that requires explanation evaluation, (b) a set of saliency-based explanations, and (c) a set of referee classifiers trained on a subset of (a).
  • ...and 16 more figures
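
The perturbation strategy in the Figure 4 caption can be sketched roughly as follows. This is a hedged illustration only: the function `perturb_top_salient` and its parameters are hypothetical, and the reading of "local" (statistics taken from the series itself) versus "global" (statistics taken from values pooled across the dataset) is an assumption, since the caption does not define them precisely.

```python
import random
from statistics import mean, pstdev


def perturb_top_salient(series, saliency, reference, fraction=0.2,
                        noise="mean", seed=0):
    """Replace the most salient time steps with a mean value or Gaussian noise.

    series:    list of floats, the time series to perturb
    saliency:  list of floats, one importance weight per time step
    reference: values that supply the statistics (assumption: pass the
               series itself for "local", dataset-wide values for "global")
    fraction:  share of time steps to perturb (0.2 = top 20%)
    noise:     "mean" or "gaussian"
    """
    if len(series) != len(saliency):
        raise ValueError("series and saliency must have the same length")
    if noise not in ("mean", "gaussian"):
        raise ValueError("noise must be 'mean' or 'gaussian'")

    # Indices of the k time steps with the highest saliency weights.
    k = max(1, round(fraction * len(series)))
    top = sorted(range(len(series)), key=lambda i: saliency[i], reverse=True)[:k]

    mu, sigma = mean(reference), pstdev(reference)
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible

    perturbed = list(series)  # copy, so the input is left untouched
    for i in top:
        perturbed[i] = mu if noise == "mean" else rng.gauss(mu, sigma)
    return perturbed


if __name__ == "__main__":
    series = [0.0, 0.0, 0.0, 5.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    saliency = [0.0, 0.0, 0.0, 0.9, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0]
    # "Local" mean perturbation: statistics come from the series itself.
    print(perturb_top_salient(series, saliency, reference=series))
```

Passing a different `reference` is the only change needed to switch between the local and global variants under this assumption, which keeps the selection of salient time steps separate from the choice of replacement statistics.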