Stochastic Sparse Sampling: A Framework for Variable-Length Medical Time Series Classification

Xavier Mootoo; Alan A. Díaz-Montiel; Milad Lankarany; Hina Tabassum

TL;DR

This work proposes Stochastic Sparse Sampling (SSS), a novel variable-length time series classification (VTSC) framework developed for medical time series. SSS outperforms state-of-the-art (SOTA) baselines at most medical centers and at all out-of-distribution (OOD), unseen medical centers.

Abstract

While the majority of time series classification research has focused on modeling fixed-length sequences, variable-length time series classification (VTSC) remains critical in healthcare, where sequence length may vary among patients and events. To address this challenge, we propose $\textbf{S}$tochastic $\textbf{S}$parse $\textbf{S}$ampling (SSS), a novel VTSC framework developed for medical time series. SSS manages variable-length sequences by sparsely sampling fixed windows to compute local predictions, which are then aggregated and calibrated to form a global prediction. We apply SSS to the task of seizure onset zone (SOZ) localization, a critical VTSC problem requiring identification of seizure-inducing brain regions from variable-length electrophysiological time series. We evaluate our method on the Epilepsy iEEG Multicenter Dataset, a heterogeneous collection of intracranial electroencephalography (iEEG) recordings obtained from four independent medical centers. SSS demonstrates superior performance compared to state-of-the-art (SOTA) baselines across most medical centers, and superior performance on all out-of-distribution (OOD) unseen medical centers. Additionally, SSS naturally provides post-hoc insights into local signal characteristics related to the SOZ, by visualizing temporally averaged local predictions throughout the signal.
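The sample-then-aggregate idea described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `sample_windows`, `toy_local_model`, and `predict` are hypothetical names, the local model is a toy stand-in (a logistic function of window variance) rather than a trained backbone, the aggregation is assumed to be a plain mean, and the calibration step is omitted.

```python
import numpy as np


def sample_windows(series, window_size, num_windows, rng):
    """Draw fixed-length windows at uniformly random start positions."""
    if len(series) < window_size:
        raise ValueError("series is shorter than the window size")
    # Upper bound is exclusive, so the last valid start is len - window_size.
    starts = rng.integers(0, len(series) - window_size + 1, size=num_windows)
    return np.stack([series[s:s + window_size] for s in starts])


def toy_local_model(windows):
    """Hypothetical stand-in for a trained local model (assumption).

    Maps each window to a two-class probability vector using a logistic
    function of the window variance.
    """
    score = windows.var(axis=1) - 1.0
    p = 1.0 / (1.0 + np.exp(-score))
    return np.column_stack([1.0 - p, p])


def predict(series, window_size=64, num_windows=16, seed=0):
    """Global prediction: mean of local predictions over sampled windows."""
    rng = np.random.default_rng(seed)
    series = np.asarray(series, dtype=float)
    windows = sample_windows(series, window_size, num_windows, rng)
    local = toy_local_model(windows)   # shape (num_windows, 2)
    return local.mean(axis=0)          # assumed aggregation: simple average


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Two recordings of different lengths go through the same fixed-size model.
    for length in (200, 5000):
        print(length, predict(rng.normal(size=length)))
```

Because every window has the same length, the local model never sees the variable sequence length; only the sampling step does.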

Paper Structure (33 sections, 1 theorem, 13 equations, 3 figures, 14 tables, 1 algorithm)

Introduction
Related Work
Method
Variable-length time series classification
Stochastic sparse sampling
Sparse Training
Inference
Experiments
Baselines
Dataset
Univariate VTSC
Out-of-Distribution VTSC
Discussion
Review of Results
Conclusion
...and 18 more sections

Key Result

Theorem 1

Fix $K, n \in \mathbb{N}$. Suppose $\alpha_1, \dots, \alpha_n \geq 0$ satisfy $\sum_{i = 1}^n \alpha_i = 1$, and $\mathbf{v}_1, \dots, \mathbf{v}_n \in [0,1]^{K}$ each satisfy $\sum_{j = 1}^K v_{ij} = 1$ for $1 \leq i \leq n$. Then $\mathbf{y} = \sum_{i = 1}^n \alpha_i \mathbf{v}_i$ also represents a valid discrete probability distribution, satisfying $\sum_{j = 1}^K y_{j} = 1$.
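A short derivation shows why a result of this kind holds. It assumes the combined vector is the convex combination $\mathbf{y} = \sum_{i=1}^n \alpha_i \mathbf{v}_i$, with $v_{ij}$ denoting the $j$-th entry of $\mathbf{v}_i$; this is a sketch under that reading, not a reproduction of the paper's proof.

$$
y_j = \sum_{i=1}^n \alpha_i v_{ij} \geq 0,
\qquad
\sum_{j=1}^K y_j
= \sum_{j=1}^K \sum_{i=1}^n \alpha_i v_{ij}
= \sum_{i=1}^n \alpha_i \sum_{j=1}^K v_{ij}
= \sum_{i=1}^n \alpha_i
= 1.
$$

Each $y_j$ is nonnegative because it is a sum of products of nonnegative terms, and since the entries sum to $1$, no single entry can exceed $1$. Hence $\mathbf{y} \in [0,1]^K$.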

Figures (3)

Figure 1: An overview of Stochastic Sparse Sampling (SSS) training procedure. ($\textbf{A}$) For a given time series, we sample windows of fixed-length at random throughout the signal. ($\textbf{B}$) Each window is processed independently by a local model with parameters $\theta$, outputting the local predictions $\hat{y}_1, \dots, \hat{y}_k$. ($\textbf{C}$) Local predictions are then fed through an aggregation function to form the final prediction $\hat{y}$.
Figure 2: Visualization of SSS window probabilities throughout iEEG channels at inference time, using the PatchTST backbone with window size $1024$. The heatmap represents locally averaged window probabilities over time, with color intensity being proportional to the likelihood of the channel belonging to the SOZ.
Figure 3: Visualization of SSS window probabilities for OOD iEEG channels at inference time, using the PatchTST backbone with window size $1024$. The heatmap represents locally averaged window probabilities over time, with color intensity being proportional to the likelihood of the channel belonging to the SOZ.
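The "locally averaged window probabilities over time" shown in the heatmaps could be computed along the following lines. This is one plausible reading, offered as an assumption: each time step receives the mean probability of all sampled windows that cover it, and uncovered time steps are left as NaN. The function name `temporal_average` and its arguments are hypothetical.

```python
import numpy as np


def temporal_average(length, starts, window_size, probs):
    """Per-time-step mean of window probabilities (one heatmap row).

    length      -- number of time steps in the channel
    starts      -- start index of each sampled window
    window_size -- fixed window length
    probs       -- one probability per window (e.g. P(channel in SOZ))
    """
    total = np.zeros(length)
    count = np.zeros(length)
    for s, p in zip(starts, probs):
        # Slicing clips at the end of the signal, so overhang is ignored.
        total[s:s + window_size] += p
        count[s:s + window_size] += 1
    out = np.full(length, np.nan)      # NaN marks time steps no window covers
    covered = count > 0
    out[covered] = total[covered] / count[covered]
    return out


if __name__ == "__main__":
    row = temporal_average(length=10, starts=[0, 2], window_size=4,
                           probs=[0.2, 0.8])
    print(row)
```

Stacking one such row per channel gives a channels-by-time array that can be rendered as a heatmap.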

Theorems & Definitions (3)

Theorem 1: Probability Distribution Guarantee
proof
Definition 2
