Explaining Time Series via Contrastive and Locally Sparse Perturbations

Zichuan Liu; Yingying Zhang; Tianchun Wang; Zefan Wang; Dongsheng Luo; Mengnan Du; Min Wu; Yi Wang; Chunlin Chen; Lunting Fan; Qingsong Wen

TL;DR

The paper addresses the distribution shift that perturbation-based explanations of multivariate time series can introduce. It proposes ContraLSP, a framework that learns counterfactual perturbations via contrastive learning and applies sample-specific sparse gates with an $\ell_0$-like penalty and temporal trend-based smoothing. Given an input $x$, a mask $m$, and a counterfactual perturbation $x^r$, the method forms $\Phi(x, m) = m \odot x + (1- m) \odot x^r$ and optimizes a loss that preserves the model's predictions while highlighting salient, temporally coherent features. The counterfactuals are guided by a triplet-based objective, and the sample-wise masks are regularized through an erf-based $\ell_0$ proxy. On synthetic white-box regression, black-box classification, and MIMIC-III clinical data, ContraLSP consistently outperforms baselines in information content and mask sharpness, and ablations confirm the value of the triplet loss and the trend-based smoothing. The approach yields faithful, sparse, and distribution-aligned perturbations, which matters in healthcare and finance settings where interpretation is important.
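The perturbation operator and the gate regularizer named in the TL;DR can be illustrated with a short NumPy sketch. Only the formula for $\Phi(x, m)$ comes from the text above. The gate form (a clipped Gaussian-noised parameter `mu`), the noise scale `sigma`, and all function names are assumptions made for illustration, not the authors' implementation.

```python
import math
import numpy as np

# Element-wise erf; NumPy has no built-in erf, so the stdlib one is vectorized.
_erf = np.vectorize(math.erf)

def perturb(x, m, x_r):
    """Phi(x, m) = m * x + (1 - m) * x_r, element-wise over a (T, D) series."""
    return m * x + (1.0 - m) * x_r

def sample_mask(mu, sigma, rng):
    """Assumed stochastic gate: add Gaussian noise to mu, then clip to [0, 1]."""
    eps = rng.normal(0.0, sigma, size=mu.shape)
    return np.clip(mu + eps, 0.0, 1.0)

def l0_proxy(mu, sigma):
    """Assumed erf-based l0 proxy: sum of P(mu + eps > 0) for eps ~ N(0, sigma^2)."""
    return float(np.sum(0.5 + 0.5 * _erf(mu / (math.sqrt(2.0) * sigma))))

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))    # toy series: T = 4 time steps, D = 3 features
x_r = np.zeros_like(x)         # placeholder counterfactual (hypothetical)
mu = np.full(x.shape, 0.5)     # hypothetical gate parameters
m = sample_mask(mu, sigma=0.5, rng=rng)
x_perturbed = perturb(x, m, x_r)
penalty = l0_proxy(mu, sigma=0.5)
```

Where the mask is 1 the original feature is kept, and where it is 0 the feature is fully replaced by the counterfactual, so a sparse mask keeps only a few entries of `x`. The proxy is differentiable in `mu`, which is why an erf form is convenient as a stand-in for a count of non-zero gates.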

Abstract

Explaining multivariate time series is a compound challenge, as it requires both identifying important locations in the time series and matching complex temporal patterns. Although previous saliency-based methods have addressed these challenges, their perturbations may not alleviate the distribution shift issue, which is especially hard to avoid with heterogeneous samples. We present ContraLSP, a locally sparse model that introduces counterfactual samples to build uninformative perturbations while keeping them in-distribution through contrastive learning. Furthermore, we incorporate sample-specific sparse gates to generate more binary-skewed and smooth masks, which easily integrate temporal trends and select the salient features parsimoniously. Empirical studies on both synthetic and real-world datasets show that ContraLSP outperforms state-of-the-art models, demonstrating a substantial improvement in explanation quality for time series data. The source code is available at https://github.com/zichuan-liu/ContraLSP.
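The contrastive idea in the abstract, a counterfactual that sits closer to negative samples than to positive ones, can be sketched as a triplet-style hinge loss. This is a minimal illustration under stated assumptions: the L1 distance, the margin value, the sign convention of the hinge, and the function name `triplet_counterfactual_loss` are all hypothetical and may differ from the paper's exact objective.

```python
import numpy as np

def triplet_counterfactual_loss(x_r, positives, negatives, margin=1.0):
    """Hinge that reaches zero once the counterfactual x_r (the anchor) is
    closer to the negatives than to the positives by at least `margin`.

    x_r:        (T, D) counterfactual series
    positives:  (K, T, D) samples that resemble the original input
    negatives:  (K, T, D) samples that differ from the original input
    """
    # Mean L1 distance from the anchor to each group (distance is an assumption).
    d_ap = np.mean(np.abs(positives - x_r).sum(axis=(1, 2)))
    d_an = np.mean(np.abs(negatives - x_r).sum(axis=(1, 2)))
    return float(max(0.0, d_an - d_ap + margin))

positives = np.zeros((2, 3, 2))   # toy positives clustered at 0
negatives = np.ones((2, 3, 2))    # toy negatives clustered at 1
loss_near_negatives = triplet_counterfactual_loss(np.ones((3, 2)), positives, negatives)
loss_near_positives = triplet_counterfactual_loss(np.zeros((3, 2)), positives, negatives)
```

In this toy setup the loss vanishes for a counterfactual that coincides with the negatives and is large for one that coincides with the positives, which is the direction of movement the abstract describes.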

Paper Structure (26 sections, 14 equations, 14 figures, 12 tables, 2 algorithms)

Introduction
Related Work
Problem Formulation
Our Method
Counterfactuals from Contrastive Learning
Sparse Gates with Smooth Constraint
Learning Objective
Experiments
White-box Regression Simulation
Black-box Classification Simulation
MIMIC-III Mortality Data
Conclusion
Regularization Term
Triple Samples Selected
Pseudo Code
...and 11 more sections

Figures (14)

Figure 1: Different styles of perturbation. The red line is a sample from class 1 of the two classes; the dark background marks the salient features, and the rest is non-salient. Other perturbations may be either informative or out-of-domain, whereas ours is a counterfactual that moves toward the distribution of negative samples.
Figure 2: The architecture of ContraLSP. A sample of features ${\bm{x}}_i\in {\mathbb{R}}^{T\times D}$ is fed simultaneously to a perturbation function $\varphi(\cdot)$ and to a trend function $\tau(\cdot)$. The perturbation function $\varphi(\cdot)$ uses ${\bm{x}}_i$ to generate counterfactuals ${\bm{x}}^r_i$ that are closer to other negative samples (but still within the sample domain) through contrastive learning. In addition, $\tau(\cdot)$ learns to predict temporal trends, which together with a set of parameters $\bm{\mu}_i$ yield the smooth vectors $\bm{\mu}'_i$. Noise $\bm{\epsilon}_i$ is injected into these vectors through the locally sparse gates to obtain the mask ${\bm{m}}_i$. Finally, the features outside the mask are replaced with the counterfactuals, and the resulting predictions are compared with the original ones to determine which features are salient.
Figure 3: Illustration of how the triplet loss shapes the counterfactual perturbations. The anchor is pulled closer to negatives and pushed farther from positives.
Figure 4: Different temperatures for the sigmoid-weighted unit. The learned trend function $\tau(\cdot)$ can better adapt the smooth vectors (red) to hard masks (black).
Figure 5: Differences between ContraLSP and Extrmask perturbations on the Rare-Observation (Diffgroups) experiment. We randomly select one sample from each of the two groups and sum all observations. The background color represents the mask value, with darker colors indicating higher values. ContraLSP provides counterfactual information, whereas Extrmask's perturbation is close to 0.
...and 9 more figures