Abstracted Shapes as Tokens -- A Generalizable and Interpretable Model for Time-series Classification
Yunshi Wen, Tengfei Ma, Tsui-Wei Weng, Lam M. Nguyen, Anak Agung Julius
TL;DR
VQShape addresses the need for interpretable, generalizable time-series representations across domains by learning discrete shape-level tokens drawn from a low-dimensional codebook. Built on a patch-based transformer encoder, it jointly learns abstracted shapes and their attributes through a suite of self-supervised objectives, including reconstruction, vector quantization, and disentanglement losses. The model yields two downstream representations, latent-space tokens and code histograms, which enable interpretable classification while keeping accuracy competitive with state-of-the-art baselines. Empirical results on 29 UEA datasets demonstrate cross-domain generalization, while visualizations and ablations reveal a universal codebook of shapes and discriminative, rule-like features in the code histograms.
Abstract
In time-series analysis, many recent works seek to provide a unified view and representation for time-series across multiple domains, leading to the development of foundation models for time-series data. Despite diverse modeling techniques, existing models are black boxes and fail to provide insights and explanations about their representations. In this paper, we present VQShape, a pre-trained, generalizable, and interpretable model for time-series representation learning and classification. By introducing a novel representation for time-series data, we forge a connection between the latent space of VQShape and shape-level features. Using vector quantization, we show that time-series from different domains can be described using a unified set of low-dimensional codes, where each code can be represented as an abstracted shape in the time domain. On classification tasks, we show that the representations of VQShape can be utilized to build interpretable classifiers, achieving comparable performance to specialist models. Additionally, in zero-shot learning, VQShape and its codebook can generalize to previously unseen datasets and domains that are not included in the pre-training process. The code and pre-trained weights are available at https://github.com/YunshiWen/VQShape.
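As a rough illustration of how a shared codebook can turn a time-series into an interpretable feature vector, the sketch below assigns each token embedding to its nearest code and counts how often each code is used. It is a minimal toy example, not the released VQShape implementation: the function names `quantize` and `code_histogram`, the 2-D four-entry codebook, and the use of plain Euclidean nearest-neighbour lookup are all assumptions made for illustration.

```python
import numpy as np

def quantize(tokens, codebook):
    """Return, for each token vector, the index of its nearest codebook entry.

    Hypothetical helper: nearest-neighbour lookup under squared Euclidean
    distance, the usual choice in vector quantization.
    """
    # Pairwise squared distances, shape (n_tokens, n_codes).
    dists = ((tokens[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

def code_histogram(indices, n_codes):
    """Count how many tokens were assigned to each code (hypothetical helper)."""
    return np.bincount(indices, minlength=n_codes)

# Toy codebook with 4 codes in 2 dimensions (assumed values, for illustration only).
codebook = np.array([[0.0, 0.0],
                     [1.0, 0.0],
                     [0.0, 1.0],
                     [1.0, 1.0]])

# Toy token embeddings, standing in for the encoder output of one time-series.
tokens = np.array([[0.1, -0.1],
                   [0.9,  0.2],
                   [0.9,  0.1],
                   [0.8,  0.9]])

idx = quantize(tokens, codebook)              # code index per token
hist = code_histogram(idx, len(codebook))     # one count per code
print(idx.tolist(), hist.tolist())
```

Each histogram entry counts occurrences of one code, so a classifier trained on such a vector can in principle be read as rules over how often particular shapes appear in the series.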
