$k$-Graph: A Graph Embedding for Interpretable Time Series Clustering

Paul Boniol, Donato Tiano, Angela Bonifati, Themis Palpanas

TL;DR

$k$-Graph outperforms current state-of-the-art time series clustering algorithms in accuracy, while providing users with meaningful explanations and interpretations of the clustering outcomes.

Abstract

Time series clustering poses a significant challenge with diverse applications across domains. A prominent drawback of existing solutions lies in their limited interpretability, often confined to presenting users with centroids. In addressing this gap, our work presents $k$-Graph, an unsupervised method explicitly crafted to augment interpretability in time series clustering. Leveraging a graph representation of time series subsequences, $k$-Graph constructs multiple graph representations based on different subsequence lengths. This feature accommodates variable-length time series without requiring users to predetermine subsequence lengths. Our experimental results reveal that $k$-Graph outperforms current state-of-the-art time series clustering algorithms in accuracy, while providing users with meaningful explanations and interpretations of the clustering outcomes.
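
The abstract states that $k$-Graph builds its graph representations from subsequences of several lengths. As a rough illustration of that first step only, the sketch below extracts sliding-window subsequences at multiple lengths with NumPy. The function names, the `stride` parameter, and the chosen lengths are assumptions made for this example; this is not the authors' implementation, and the graph construction itself is not shown.

```python
import numpy as np


def sliding_subsequences(series, length, stride=1):
    """Return all contiguous subsequences of `length` from a 1-D series.

    Hypothetical helper for illustration; the paper's actual extraction
    procedure may differ (e.g., normalization or a different stride).
    """
    series = np.asarray(series, dtype=float)
    if length > series.size:
        # A series shorter than the window yields no subsequences.
        return np.empty((0, length))
    windows = np.lib.stride_tricks.sliding_window_view(series, length)
    return windows[::stride]


def multi_length_subsequences(series, lengths):
    """Map each subsequence length to its array of sliding windows."""
    return {length: sliding_subsequences(series, length) for length in lengths}


# Toy series of 50 points; the lengths are arbitrary example values.
series = np.sin(np.linspace(0, 4 * np.pi, 50))
subs = multi_length_subsequences(series, lengths=[5, 10, 20])
for length, windows in subs.items():
    print(length, windows.shape)
```

Each length yields `len(series) - length + 1` windows at stride 1, so a series shorter than a given length simply contributes no subsequences for that length.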

Paper Structure

This paper contains 24 sections, 2 theorems, 10 equations, 12 figures, 3 tables, 2 algorithms.

Key Result

Lemma 1

For a given clustering partition $C = \{C_1, C_2, \ldots, C_k\}$, if $\lambda \leq k$, then $\bigcup_{C_i \in C} \mathcal{G}^{\lambda}_{C_i} = \mathcal{G}$. If $\lambda > 0.5$, then $\bigcap_{C_i \in C} \mathcal{G}^{\lambda}_{C_i} = \emptyset$.

Figures (12)

  • Figure 1: The graph $\mathcal{G}$ produced by $k$-Graph when applied to the Trace dataset [Dau2018TheUT].
  • Figure 2: $\lambda$-Graphoids and $\gamma$-Graphoids for different $\lambda$ and $\gamma$.
  • Figure 3: $k$-Graph pipeline.
  • Figure 4: $W_c$ and $W_e$ for $k$-Graph (with $k=4$ and $M=10$) applied on the Trace dataset of the UCR-Archive.
  • Figure 5: Experimental comparison of $k$-Graph versus the baselines on the UCR archive. In (a), the mean values are shown as white squares, and the horizontal red dotted line marks the best mean.
  • ...and 7 more figures

Theorems & Definitions (8)

  • Definition 1: Graph Embedding
  • Definition 2: $Graphoid$
  • Definition 3: Node representativity
  • Definition 4: Node Exclusivity
  • Definition 5: $\lambda$-$Graphoid$
  • Definition 6: $\gamma$-$Graphoid$
  • Lemma 1
  • Lemma 2