Castor: Competing shapelets for fast and accurate time series classification
Isak Samsten, Zed Lee
TL;DR
Castor introduces a competing dilated shapelet transform for time series classification. It organizes randomly sampled shapelets into groups that compete over temporal contexts, producing a rich, distance-based feature space. By combining dilation and competition with optional first-order differences and a z-normalization mix, Castor achieves state-of-the-art accuracy among shapelet-based methods and is competitive with random-convolution and dictionary-based approaches. The paper provides a comprehensive ablation study to identify robust default hyperparameters, demonstrates linear scalability with data size, and offers an open-source implementation for reproducibility. Overall, Castor is a fast, accurate, and interpretable alternative for time series classification, supported by empirical evidence across a large benchmark.
Abstract
Shapelets are discriminative subsequences that were originally embedded in shapelet-based decision trees and have since been extended to shapelet-based transformations. We propose Castor, a simple, efficient, and accurate time series classification algorithm that utilizes shapelets to transform time series. The transformation organizes shapelets into groups with varying dilation and allows the shapelets to compete over the time context to construct a diverse feature representation. By organizing the shapelets into groups, we enable the transformation to transition between levels of competition, yielding methods that more closely resemble either distance-based or dictionary-based transformations. We demonstrate, through an extensive empirical investigation, that classifiers built on Castor transformations are significantly more accurate than several state-of-the-art classifiers. In an extensive ablation study, we examine the effect of hyperparameter choices and suggest accurate and efficient default values.
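To make the idea of competing, dilated shapelets concrete, the following is a minimal sketch and not the authors' implementation. It assumes Euclidean distance, equal-length shapelets within a group, and an illustrative pair of features per shapelet (minimum distance and the number of time steps at which the shapelet is the closest in its group). The function names `dilated_distance_profile` and `competing_group_features` are hypothetical.

```python
import numpy as np


def dilated_distance_profile(x, shapelet, dilation):
    """Euclidean distance between `shapelet` and every dilated window of `x`."""
    length = len(shapelet)
    span = (length - 1) * dilation + 1  # time steps covered by one dilated window
    n_windows = len(x) - span + 1
    if n_windows < 1:
        raise ValueError("series too short for this shapelet length and dilation")
    # Row t holds the indices t, t + d, t + 2d, ... of window t.
    idx = np.arange(n_windows)[:, None] + dilation * np.arange(length)[None, :]
    return np.sqrt(((x[idx] - shapelet) ** 2).sum(axis=1))


def competing_group_features(x, group, dilation):
    """Features for one group of equal-length shapelets sharing a dilation.

    Assumption (illustrative only): each shapelet contributes its minimum
    distance and the count of time steps at which it wins the competition,
    i.e. is the closest shapelet in the group.
    """
    profiles = np.stack([dilated_distance_profile(x, s, dilation) for s in group])
    min_dist = profiles.min(axis=1)
    winners = profiles.argmin(axis=0)  # competition: closest shapelet per time step
    win_counts = np.bincount(winners, minlength=len(group))
    return np.concatenate([min_dist, win_counts.astype(float)])


# Toy usage: an alternating series and two constant shapelets.
x = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0])
group = [np.zeros(3), np.ones(3)]
features = competing_group_features(x, group, dilation=2)
```

In this sketch, a group containing a single shapelet reduces to a plain distance-based feature, while larger groups make the win counts behave more like a histogram over a dictionary of patterns, which mirrors the transition between levels of competition described above.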
