On the clustering behavior of sliding windows

Boris Alexeev; Wenyan Luo; Dustin G. Mixon; Yan X Zhang

TL;DR

This paper analyzes the pitfalls of clustering sliding windows of time series under $k$-means and spectral clustering, showing that the window length relative to the series length drives qualitatively different failure modes. It provides theoretical bounds and constructions explaining why small windows yield flat centroids, why near-symmetric window arrangements produce sinusoidal centroids, and why large windows tend to produce interval-based clusters. The results combine probabilistic and spectral-analysis techniques, including PCA, Wedin's sinΘ theorem, and Grassmannian distance concepts, and are illustrated with real and synthetic data. Collectively, the findings inform how to interpret cluster structure in sliding-window representations and suggest cautions for choosing $w$ and $k$ in time-series clustering tasks.

Abstract

Things can go spectacularly wrong when clustering timeseries data that has been preprocessed with a sliding window. We highlight three surprising failures that emerge depending on how the window size compares with the timeseries length. In addition to computational examples, we present theoretical explanations for each of these failure modes.
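As a rough illustration of the preprocessing step the abstract refers to, the sketch below extracts every length-$w$ sliding window from a series of length $n$. The function name `sliding_windows` and the toy series are assumptions made for illustration and are not taken from the paper. The resulting $n - w + 1$ windows are the points that a method such as $k$-means would then be asked to cluster.

```python
def sliding_windows(series, w):
    """Return every contiguous length-w window of `series`, in order.

    Hypothetical helper for illustration; not the authors' code.
    """
    n = len(series)
    if not 1 <= w <= n:
        raise ValueError("window length w must satisfy 1 <= w <= len(series)")
    # Window i covers positions i, i+1, ..., i+w-1 of the series.
    return [list(series[i:i + w]) for i in range(n - w + 1)]


# Toy series (made-up data, for illustration only).
series = [0, 1, 2, 3, 4]
windows = sliding_windows(series, w=3)
```

Consecutive windows share $w - 1$ entries, so the points handed to the clustering algorithm are strongly dependent on one another rather than independent samples.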
