Adaptive Structured Pruning of Convolutional Neural Networks for Time Series Classification

Javidan Abdullayev, Maxime Devanne, Cyril Meyer, Ali Ismail-Fawaz, Jonathan Weber, Germain Forestier

TL;DR

The paper proposes Dynamic Structured Pruning (DSP), a fully automatic structured pruning framework for convolution-based TSC models. DSP achieves an average compression of 58% for LITETime and 75% for InceptionTime while maintaining classification accuracy.

Abstract

Deep learning models for Time Series Classification (TSC) have achieved strong predictive performance, but their high computational and memory requirements often limit deployment on resource-constrained devices. While structured pruning can address these issues by removing redundant filters, existing methods typically rely on manually tuned hyperparameters such as pruning ratios, which limits scalability and generalization across datasets. In this work, we propose Dynamic Structured Pruning (DSP), a fully automatic structured pruning framework for convolution-based TSC models. DSP introduces an instance-wise sparsity loss during training to induce channel-level sparsity, followed by a global activation analysis that identifies and prunes redundant filters without any predefined pruning ratio. This work tackles the computational bottlenecks of deep TSC models for deployment on resource-constrained devices. We validate DSP on 128 UCR datasets using two state-of-the-art deep architectures: LITETime and InceptionTime. Our approach achieves an average compression of 58% for LITETime and 75% for InceptionTime while maintaining classification accuracy. Redundancy analyses confirm that DSP produces compact and informative representations, offering a practical path for scalable and efficient deep TSC deployment.

Paper Structure

This paper contains 27 sections, 6 equations, 13 figures, 2 tables, and 1 algorithm.

Figures (13)

  • Figure 1: Accuracy versus FLOPs for baseline and pruned (DSP) models across LITETime and InceptionTime architectures on the EthanolLevel dataset. Marker size reflects the number of parameters (log-scaled). In this comparison, our DSP models achieve better test performance while significantly reducing computational cost and model size.
  • Figure 2: An overview of the proposed DSP framework.
  • Figure 3: Illustration of sparsity-based training using instance-wise sparsity loss to induce channel sparsity.
  • Figure 4: Pruning strategy that leverages feature activations across the dataset samples. Here, $f_{c}$ refers to the output activation of the $c^{\text{th}}$ channel for a given instance, and $m_{n}$ represents one of the $N$ input instances in the dataset.
  • Figure 5: Comparison of Base, Pretrained, Pruned, Finetuned, and Scratch-trained models for LITE and Inception architectures.
  • ...and 8 more figures
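
The two stages named in the abstract and in the captions of Figures 3 and 4 (a sparsity penalty computed per instance, then a dataset-wide activation check that decides which filters to drop) can be illustrated with a small NumPy sketch. This excerpt does not give the paper's actual loss or pruning criterion, so everything below is an assumption made for illustration: the penalty is a plain L1-style term on per-instance channel activations, a channel counts as redundant when its mean absolute activation stays below a small threshold `eps` for every instance, and the function names are hypothetical.

```python
import numpy as np


def instance_sparsity_penalty(activations: np.ndarray) -> float:
    """Assumed L1-style penalty, not the paper's exact loss.

    activations has shape (N, C, T): N instances, C channels, T time steps.
    For each instance, sum the mean absolute activation of every channel,
    then average that sum over the instances.
    """
    per_instance_channel = np.abs(activations).mean(axis=2)  # shape (N, C)
    return float(per_instance_channel.sum(axis=1).mean())


def select_active_channels(activations: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Assumed global criterion: keep a channel if at least one instance
    activates it above eps. No pruning ratio is supplied; the number of
    kept channels follows from the activations alone.
    """
    per_instance_channel = np.abs(activations).mean(axis=2)  # shape (N, C)
    global_score = per_instance_channel.max(axis=0)          # shape (C,)
    return np.flatnonzero(global_score > eps)


def prune_conv1d(weight: np.ndarray, bias: np.ndarray, keep: np.ndarray):
    """Drop whole output filters. weight has shape (C_out, C_in, K).

    In a full network, the input channels of the following layer would
    have to be sliced with the same index set.
    """
    return weight[keep], bias[keep]


# Toy data: 8 instances, 6 channels, 32 time steps; channels 1 and 4 are silent.
rng = np.random.default_rng(0)
acts = np.abs(rng.normal(size=(8, 6, 32)))
acts[:, [1, 4], :] = 0.0

penalty = instance_sparsity_penalty(acts)
keep = select_active_channels(acts)

weight = rng.normal(size=(6, 3, 5))
bias = rng.normal(size=6)
pruned_weight, pruned_bias = prune_conv1d(weight, bias, keep)

# Fraction of this layer's weights removed by pruning.
compression = 1.0 - pruned_weight.size / weight.size
```

In this toy setup the two silent channels are removed and the other four are kept, so one third of the layer's weights disappear. The threshold `eps` is a free choice in this sketch; how the paper decides that a channel is inactive is not stated in this excerpt.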