ShiftDTW: adapting the DTW metric for cyclic time series clustering

Lucas Foulon, Ilyes Korichi, Xavier Millot

TL;DR

ShiftDTW addresses clustering of cyclic time series by integrating a Sakoe-Chiba–bounded DTW variant with K-Means, preserving cyclic alignment without incurring CDTW’s full cost. It standardizes seasonality using Prophet and then uses a doubled distance matrix to explore multiple bounded alignments, keeping the same $O(mn)$ time complexity as DTW. The method preserves per-series shifts during centroid updates, so that seasonally offset patterns can be grouped together. Empirical results on synthetic cyclic data and real accounting series show that accounting for shifts can improve clustering over Euclidean and DTW baselines while remaining computationally efficient. Future work includes broader benchmarks, comparisons with CDTW, and extensions to multi-indicator and barycenter-aware DTW clustering.

Abstract

The elasticity of the DTW metric allows a more flexible comparison between time series and is used in many machine learning tasks such as classification and clustering. However, DTW cannot align the measurements at the beginning and end of two time series when one of them is shifted so that the part missing at its start reappears at its end. Because such cyclic series have no definite beginning or end, we build on the Cyclic DTW approach and propose a less computationally expensive approximation of it. This approximation is then used in conjunction with the K-Means clustering method.
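The idea of exploring several band-constrained alignments over a doubled series can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `banded_dtw` and `shift_dtw_sketch`, the absolute-difference local cost, the equal-length restriction, and the default spacing between candidate shifts (`step = 2 * radius + 1`) are all assumptions made for illustration.

```python
import numpy as np


def banded_dtw(x, y, radius):
    """DTW cost between two equal-length 1-D arrays, restricted to a
    Sakoe-Chiba band of half-width `radius` around the diagonal."""
    n = len(x)
    acc = np.full((n + 1, n + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        lo = max(1, i - radius)
        hi = min(n, i + radius)
        for j in range(lo, hi + 1):
            cost = abs(x[i - 1] - y[j - 1])  # assumed local cost
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, n]


def shift_dtw_sketch(x, y, radius, step=None):
    """Hypothetical shift-aware distance: slide a length-n window over the
    doubled series y+y and keep the cheapest banded DTW alignment.
    Returns (best_cost, best_shift)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if len(x) != len(y):
        raise ValueError("this sketch assumes series of equal length")
    n = len(x)
    if step is None:
        step = 2 * radius + 1  # assumed spacing between candidate shifts
    doubled = np.concatenate([y, y])  # lets a window wrap around the cycle
    best_cost, best_shift = np.inf, 0
    for shift in range(0, n, step):
        cost = banded_dtw(x, doubled[shift:shift + n], radius)
        if cost < best_cost:
            best_cost, best_shift = cost, shift
    return best_cost, best_shift


# Usage: y is x rotated by half a period of 12 points.
x = np.sin(2 * np.pi * np.arange(12) / 12)
y = np.roll(x, 6)
print(shift_dtw_sketch(x, y, radius=1))  # candidate shifts are 0, 3, 6, 9
print(banded_dtw(x, y, radius=1))        # plain banded DTW, no shift
```

In this example the window starting at shift 6 reproduces `x` exactly, so the shifted alignment has zero cost, whereas the unshifted banded DTW cannot absorb a half-period offset within a band of radius 1. With the assumed step, the number of windows times the band width stays on the order of the series length, which keeps the total work comparable to a full DTW matrix.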

Paper Structure (13 sections, 4 equations, 7 figures, 1 table, 1 algorithm)

Figures (7)

  • Figure 1: Distance matrices between two time series. Both series have the same values, but one of them has been shifted in time.
  • Figure 2: The doubled distance matrix between two time series.
  • Figure 3: Distance matrices between two time series with a Sakoe-Chiba mask. The matrix of the first iteration is similar to the masked DTW computation shown in Figure \ref{fig:def:matrice_distance:with_mask}. The second iteration returns a smaller, and therefore better, score.
  • Figure 4: Overview of the 6 clusters selected from the 100 clusters computed with Euclidean K-Means.
  • Figure 5: Clustering results with time series of length 12.
  • ...and 2 more figures