
Classification of High-dimensional Time Series in Spectral Domain using Explainable Features

Sarbojit Roy, Malik Shahid Sultan, Hernando Ombao

TL;DR

This work tackles high-dimensional time series classification in the spectral domain by exploiting sparsity in the difference of inverse spectral density matrices across classes, i.e., $\widetilde{\mathbf{D}}_k = \boldsymbol{\Theta}_{2k}-\boldsymbol{\Theta}_{1k}$ for frequencies $\omega_k \in \Omega_T$. It develops two estimation routes: (i) a convex D-trace loss with constrained $\ell_1$ minimization to directly estimate $\widetilde{\mathbf{D}}_k$ from class-specific SDMs and (ii) an ADAM-based joint estimator that optimizes a penalized likelihood combining cross-entropy with the D-trace term, enabling scalable training with mini-batching. A frequency-screening procedure using the Frobenius norms $d_k=\|\widetilde{\mathbf{D}}_k\|_F$ yields a sure-screening property to identify signal frequencies $\Omega^\mathbf{D}_T$, while allowing covariate importance to vary by frequency. Theoretical guarantees establish consistency of $\hat{\widetilde{\mathbf{D}}}_k$ and the screening step in ultrahigh dimensions, and empirical results on simulations and an Alert-vs-Drowsy EEG dataset demonstrate strong classification performance together with interpretable, frequency- and edge-specific connectivity differences that are valuable for neuroscience applications.
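As a rough illustration of the two ingredients named above, the following sketch writes out one common real-symmetric form of the D-trace loss and the Frobenius-norm screening step. It is a toy under stated assumptions, not the authors' implementation. The SDMs are taken to be real and symmetric (the actual SDMs are complex Hermitian), the inverse SDMs are known exactly, and the function names, the toy matrices, and the `threshold` value are all hypothetical.

```python
import numpy as np

def dtrace_loss(D, S1, S2):
    """D-trace loss (real symmetric illustration).

    For symmetric positive definite S1, S2 it is minimized at
    D = inv(S2) - inv(S1), i.e. the difference of inverse matrices.
    """
    quad = 0.25 * (np.sum((S1 @ D) * (D @ S2)) + np.sum((S2 @ D) * (D @ S1)))
    return quad - np.sum(D * (S1 - S2))

def dtrace_grad(D, S1, S2):
    """Gradient of the D-trace loss with respect to D (S1, S2 symmetric)."""
    return 0.5 * (S1 @ D @ S2 + S2 @ D @ S1) - (S1 - S2)

def screen_frequencies(D_list, threshold):
    """Compute d_k = ||D_k||_F per frequency and keep indices with d_k > threshold."""
    d = np.array([np.linalg.norm(D, "fro") for D in D_list])
    return d, np.flatnonzero(d > threshold)

# Toy setup (assumed): three frequencies, classes differ only at frequency index 1.
p = 4
theta1 = np.eye(p)                      # inverse SDM of class 1 (same at all frequencies)
E = np.zeros((p, p))
E[0, 1] = E[1, 0] = 0.4                 # a single differing edge
theta2_by_freq = [theta1.copy(), theta1 + E, theta1.copy()]

# Difference matrices D_k = Theta_2k - Theta_1k at each frequency.
D_list = [theta2 - theta1 for theta2 in theta2_by_freq]

d, kept = screen_frequencies(D_list, threshold=0.1)
print("Frobenius norms:", d)
print("Screened-in frequency indices:", kept)

# Sanity check: the true difference is a stationary point of the D-trace loss.
S1 = np.linalg.inv(theta1)
S2 = np.linalg.inv(theta2_by_freq[1])
print("max |gradient| at true D:", np.abs(dtrace_grad(D_list[1], S1, S2)).max())
```

In practice the $\widetilde{\mathbf{D}}_k$ would be estimated from data rather than computed from known inverses, and the threshold would be chosen according to the screening theory rather than fixed by hand.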

Abstract

Interpretable classification of time series presents significant challenges in high dimensions. Traditional feature selection methods in the frequency domain often assume sparsity in spectral density matrices (SDMs) or their inverses, which can be restrictive for real-world applications. In this article, we propose a model-based approach for classifying high-dimensional stationary time series by assuming sparsity in the difference between inverse SDMs. Our approach emphasizes the interpretability of model parameters, making it especially suitable for fields like neuroscience, where understanding differences in brain network connectivity across various states is crucial. The estimators for model parameters demonstrate consistency under appropriate conditions. We further propose using standard deep learning optimizers for parameter estimation, employing techniques such as mini-batching and learning rate scheduling. Additionally, we introduce a method to screen the most discriminatory frequencies for classification, which exhibits the sure screening property under general conditions. The flexibility of the proposed model allows the significance of covariates to vary across frequencies, enabling nuanced inferences and deeper insights into the underlying problem. The novelty of our method lies in the interpretability of the model parameters, addressing critical needs in neuroscience. The proposed approaches have been evaluated on simulated examples and the 'Alert-vs-Drowsy' EEG dataset.
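To make the optimizer-based estimation route more concrete, here is a minimal sketch of Adam with an exponentially decaying learning rate applied to a D-trace objective. It is a simplification under stated assumptions and not the authors' joint estimator. It omits the cross-entropy term and mini-batching, treats the SDMs as real symmetric matrices, and handles the $\ell_1$ penalty with a plain subgradient. The function name `adam_dtrace` and all hyperparameter values are hypothetical.

```python
import numpy as np

def dtrace_grad(D, S1, S2):
    """Gradient of the (real symmetric) D-trace loss with respect to D."""
    return 0.5 * (S1 @ D @ S2 + S2 @ D @ S1) - (S1 - S2)

def adam_dtrace(S1, S2, lam=0.0, lr=0.05, decay=0.995, steps=1500,
                beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize D-trace loss + lam * ||D||_1 with hand-written Adam updates.

    The learning rate follows an exponential schedule lr * decay**(t - 1).
    The l1 term enters through the subgradient lam * sign(D).
    """
    D = np.zeros_like(S1)
    m = np.zeros_like(D)   # first-moment (mean) estimate
    v = np.zeros_like(D)   # second-moment (uncentered variance) estimate
    for t in range(1, steps + 1):
        g = dtrace_grad(D, S1, S2) + lam * np.sign(D)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)          # bias correction
        v_hat = v / (1 - beta2 ** t)
        step_size = lr * decay ** (t - 1)     # scheduled learning rate
        D = D - step_size * m_hat / (np.sqrt(v_hat) + eps)
    return D

# Toy problem (assumed): the two inverse matrices differ in a single edge.
p = 3
theta1 = np.eye(p)
E = np.zeros((p, p))
E[0, 1] = E[1, 0] = 0.4
theta2 = theta1 + E

S1 = np.linalg.inv(theta1)
S2 = np.linalg.inv(theta2)

D_hat = adam_dtrace(S1, S2, lam=0.0)
print("max abs error vs. true difference:", np.abs(D_hat - E).max())
```

With `lam=0.0` the objective is a smooth convex quadratic, so the iterate should approach the true difference `theta2 - theta1`; setting `lam > 0` adds shrinkage toward sparse solutions, and a proximal (soft-thresholding) step would be a natural alternative to the subgradient used here.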

Paper Structure (15 sections, 8 theorems, 63 equations, 2 figures, 3 tables)

This paper contains 15 sections, 8 theorems, 63 equations, 2 figures, 3 tables.

Key Result

Theorem 1

If Assumption A1 is satisfied for $\eta_1>2$, $\min\{n_1,n_2\}> C\max_k\bar{\sigma}_k^{-2}(\eta_1\ln{p} + \ln{4})$ for some $C>0$, and $\widetilde{\mathbf{D}}_k$ is estimated using $\lambda_{\eta_1 k}$ in gic, then

Figures (2)

  • Figure 1: The left panel shows the true sparse structure of $\widetilde{\mathbf{D}}_1$ in Example 2. The middle and the right panels show the structures estimated using Dtrace and $\ell$, respectively (based on the first iteration).
  • Figure 2: (e)-(d): Clockwise from bottom-left, the interactions between channels for frequency bands $\theta,\gamma,\alpha,\beta$ and $\delta$ are displayed using heatmaps. (f) The difference between auto-covariance matrices $\boldsymbol{\Gamma}_1(0)$ and $\boldsymbol{\Gamma}_2(0)$. $|\mathbf{M}|$ denotes the element-wise absolute value of the matrix $\mathbf{M}$.

Theorems & Definitions (13)

  • Theorem 1
  • Theorem 2
  • Lemma 1
  • proof
  • Lemma 2
  • Lemma 3
  • proof
  • Lemma 4
  • proof
  • Lemma 5
  • ...and 3 more