Energy-Efficient Transformer Inference: Optimization Strategies for Time Series Classification

Arshia Kermani; Ehsan Zeraatkar; Habib Irani

TL;DR

This work addresses the high energy cost of transformer inference in time series classification by systematically evaluating pruning and quantization strategies. It deploys a Vision Transformer–based time series model on three datasets (RefrigerationDevices, ElectricDevices, PLAID) and compares two configurations (T1 and T2) to expose capacity–efficiency trade-offs. Static quantization reduces energy consumption by 29.14% with minimal accuracy loss, while L1 pruning yields up to 1.63× faster inference and 37.08% energy savings; 8-bit quantization limits accuracy degradation to roughly 1.4–1.8%. A hybrid of pruning and quantization can reach up to 45.7% overall energy reduction with limited accuracy loss, offering actionable guidance for edge and resource-constrained deployments of transformer-based time series classifiers.

Abstract

The increasing computational demands of transformer models in time series classification necessitate effective optimization strategies for energy-efficient deployment. Our study presents a systematic investigation of optimization techniques, focusing on structured pruning and quantization methods for transformer architectures. Through extensive experimentation on three distinct datasets (RefrigerationDevices, ElectricDevices, and PLAID), we quantitatively evaluate model performance and energy efficiency across different transformer configurations. Our experimental results demonstrate that static quantization reduces energy consumption by 29.14% while maintaining classification performance, and L1 pruning achieves a 63% improvement in inference speed with minimal accuracy degradation. Our findings provide valuable insights into the effectiveness of optimization strategies for transformer-based time series classification, establishing a foundation for efficient model deployment in resource-constrained environments.
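
To make the two optimization families named above concrete, the following is a minimal, dependency-free sketch and not the authors' implementation. It shows L1 (magnitude-based) pruning and symmetric 8-bit quantization applied to a flat list of weights. Everything in it is an assumption made for illustration: the function names `l1_prune`, `quantize_int8`, and `dequantize`, the 50% sparsity level, and the weight values. The pruning shown is unstructured, whereas the paper studies structured pruning, and the quantization is weight-only, whereas static quantization in practice also calibrates activation ranges.

```python
# Illustrative sketch only; names, sparsity level, and weights are hypothetical.

def l1_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest |w|."""
    k = int(len(weights) * sparsity)  # number of weights to remove
    if k == 0:
        return list(weights)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric per-tensor quantization: w is approximated by scale * q,
    with integer q in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [scale * v for v in q]


# Made-up weights standing in for one layer of a model.
weights = [0.82, -0.05, 0.31, -0.67, 0.02, 0.44, -0.09, 0.58]

pruned = l1_prune(weights, sparsity=0.5)   # drop the 4 smallest-magnitude weights
q, scale = quantize_int8(pruned)           # 8-bit integer codes plus one scale
restored = dequantize(q, scale)

# Worst-case rounding error is bounded by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(pruned, restored))
```

In this sketch, pruning removes the weights that contribute least by magnitude, and quantization then stores the survivors as small integers with a single shared scale. Real speed and energy gains depend on hardware and runtime support for sparse or integer kernels, which this example does not model.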
