PRISM: Lightweight Multivariate Time-Series Classification through Symmetric Multi-Resolution Convolutional Layers

Federico Zucchi, Thomas Lampert

TL;DR

PRISM introduces a lightweight, fully convolutional approach for multivariate time-series classification by enforcing per-channel symmetry and multi-resolution filtering to capture both fast and slow dynamics with far fewer parameters than Transformer-based models. The method combines a symmetric multi-resolution module with resolution-informed patch embedding and a simple cross-resolution mixing mechanism, producing expressive embeddings while maintaining low FLOPs. Across UEA benchmarks, HAR, Sleep-EDF, and ISRUC-S3, PRISM matches or outperforms state-of-the-art CNNs and Transformers, often with an order of magnitude fewer parameters and compute. Ablation studies confirm the value of multi-resolution diversity and symmetry for spectral selectivity, stopband attenuation, and filter diversity, and point to future improvements through cross-channel interactions and self-supervised pre-training.

Abstract

Multivariate time series classification supports applications from wearable sensing to biomedical monitoring and demands models that can capture both short-term patterns and longer-range temporal dependencies. Despite recent advances, Transformer and CNN models often remain computationally heavy and rely on many parameters. This work presents PRISM (Per-channel Resolution Informed Symmetric Module), a lightweight fully convolutional classifier. PRISM operates in a channel-independent manner and, in its early stage, applies a set of multi-resolution symmetric convolutional filters. This symmetry enforces structural constraints inspired by linear-phase FIR filters from classical signal processing, effectively halving the number of learnable parameters within the initial layers while preserving the full receptive field. Across the diverse UEA multivariate time-series archive as well as specific benchmarks in human activity recognition, sleep staging, and biomedical signals, PRISM matches or outperforms state-of-the-art CNN and Transformer models while using significantly fewer parameters and markedly lower computational cost. By bringing a principled signal processing prior into a modern neural architecture, PRISM offers an effective and computationally economical solution for multivariate time series classification.
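The parameter saving from the symmetry constraint can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `symmetric_kernel` and `symmetric_conv`, the example weights, and the restriction to odd-length kernels are assumptions made for illustration. An odd-length kernel with $2m+1$ taps is built from only $m+1$ free weights, so the receptive field is unchanged while the number of learnable values is roughly halved.

```python
import numpy as np

def symmetric_kernel(half):
    """Expand m+1 free weights into a (2m+1)-tap kernel mirrored about its centre.

    half[0] is the centre tap; half[j] is shared by the taps at distance j
    on either side of the centre (hypothetical parameterisation).
    """
    half = np.asarray(half, dtype=float)
    # half[:0:-1] is half[1:] reversed, which forms the left side of the kernel
    return np.concatenate([half[:0:-1], half])

def symmetric_conv(x, half):
    """Filter a 1-D signal with the mirrored kernel, keeping the input length."""
    return np.convolve(x, symmetric_kernel(half), mode="same")

half = np.array([0.5, 0.25, 0.1])   # 3 free weights (illustrative values)
kernel = symmetric_kernel(half)     # 5 taps: same receptive field as an unconstrained kernel

x = np.sin(np.linspace(0.0, 2.0 * np.pi, 32))
y = symmetric_conv(x, half)

# With the centre tap moved to index 0, the DFT of a mirrored kernel is purely
# real: the filter changes amplitudes but does not distort phase.
response = np.fft.fft(np.fft.ifftshift(kernel))
print(kernel.size, half.size, np.allclose(response.imag, 0.0))
```

The final check reflects the linear-phase property that motivates the constraint: a kernel mirrored about its centre has a real-valued frequency response once the centring delay is removed.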

Paper Structure

This paper contains 34 sections, 9 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Starting from the left, the figure shows the full PRISM architecture, where each input channel $\mathbf{x}^{i}$ is processed independently through a shared module. The central panel details this module: a symmetric multi-resolution convolution stage generates resolution-specific responses, which are then summarised into local feature sequences $\mathbf{Z}^{i}$. These per-resolution streams are fused by a pointwise mixing layer to form the resolution-informed embedding sequence $\mathbf{X}^{i}$ of dimension $D$. This sequence is refined pointwise via ReLU activation to introduce non-linearity, and layer normalisation to standardise the representations. The right panel highlights the symmetric filter design, where kernels are mirrored around their centre $m$ (enforcing $w_{m-j}=w_{m+j}$) to reduce parameters. After all channels are processed, feature sequences are pooled across time and channels before being passed to a linear classifier.
  • Figure 2: Accuracy vs. complexity for PRISM and baselines on (a) UCI-HAR, (b) Sleep-EDF, and (c) InsectWingbeat. The marker size is proportional to the computational cost, measured in GFLOPs, providing a visual indication of each model's complexity.
  • Figure 3: Heatmap of mean classification accuracy, averaged across all HAR and BIO datasets, as a function of the scale set (rows) and the number of kernels per scale $k$ (columns). Each cell is annotated with the corresponding averaged accuracy.
  • Figure 4: Computational complexity (GFLOPs) as a function of the scale set and the number of kernels per scale $k$. Each curve corresponds to a fixed $k$ and illustrates how the overall cost increases when adding new temporal resolutions.
  • Figure 5: Comparison of spectral metrics between symmetric and asymmetric PRISM filters across 29 UEA datasets. The three panels report Q-factor, stopband attenuation, and pairwise spectral distance respectively.
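The data flow described in the Figure 1 caption can be sketched in NumPy as follows. This is a rough illustration under stated assumptions, not the authors' code: the function name `prism_like_forward`, the use of patch-wise averaging to summarise responses into $\mathbf{Z}^{i}$, mean pooling over time and channels, the kernel lengths, and the random weights are all hypothetical choices made so the example runs end to end.

```python
import numpy as np

rng = np.random.default_rng(0)

def mirror(half):
    """Build a kernel with w[m-j] == w[m+j] from its centre-and-right half."""
    return np.concatenate([half[:0:-1], half])

def prism_like_forward(x, halves, patch, w_mix, w_cls):
    """x: (C, T) input; halves: free weights of each symmetric filter.

    w_mix: (F, D) pointwise mixing weights, w_cls: (D, n_classes).
    Patch averaging and mean pooling are assumptions of this sketch.
    """
    n_channels, length = x.shape
    n_patches = length // patch
    per_channel = []
    for i in range(n_channels):  # the same module is shared by every channel
        # Symmetric multi-resolution convolution: one response per filter, (F, T)
        resp = np.stack([np.convolve(x[i], mirror(h), mode="same") for h in halves])
        # Summarise each response into a local feature sequence Z^i, (F, P)
        z = resp[:, : n_patches * patch].reshape(len(halves), n_patches, patch).mean(axis=2)
        # Pointwise mixing across resolutions, then ReLU, giving X^i of shape (P, D)
        emb = np.maximum(z.T @ w_mix, 0.0)
        # Layer normalisation over the embedding dimension D
        emb = (emb - emb.mean(-1, keepdims=True)) / np.sqrt(emb.var(-1, keepdims=True) + 1e-5)
        per_channel.append(emb)
    pooled = np.stack(per_channel).mean(axis=(0, 1))  # pool over channels and time
    return pooled @ w_cls                             # linear classifier logits

# Three resolutions (kernel lengths 3, 5, 9) with k = 2 filters each: 6 filters
halves = [rng.standard_normal(n) for n in (2, 2, 3, 3, 5, 5)]
x = rng.standard_normal((3, 64))                      # 3 channels, 64 time steps
w_mix = rng.standard_normal((len(halves), 16))        # D = 16
w_cls = rng.standard_normal((16, 4))                  # 4 classes
logits = prism_like_forward(x, halves, patch=8, w_mix=w_mix, w_cls=w_cls)
print(logits.shape)
```

Because the module is shared and applied per channel, the number of filter and mixing weights in this sketch does not grow with the number of input channels.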