Persistent homology of featured time series data and its applications

Eunwoo Heo, Jae-Hun Jung

TL;DR

This work extends persistent homology for time series by introducing featured time series data and influence vectors, allowing domain-informed customization of graph representations and PH computations. It proves a stability theorem ensuring that small changes in the influence vector lead to small changes in persistence diagrams, enhancing robustness. The approach is validated through stock-market anomaly detection and musical data analysis, demonstrating improved preservation of domain-relevant features and flexible interpretation. Overall, the method offers a principled, adaptable PH framework for diverse time-series domains with practical implications for monitoring and analysis.

Abstract

Recent studies have actively employed persistent homology (PH), a topological data analysis technique, to analyze the topological information in time series data. Many successful studies have utilized graph representations of time series data for PH calculation. Given the diverse nature of time series data, it is crucial to have mechanisms that can adjust the PH calculations by incorporating domain-specific knowledge. In this context, we introduce a methodology that allows the adjustment of PH calculations by reflecting relevant domain knowledge in specific fields. We introduce the concept of featured time series, which is the pair of a time series augmented with specific features such as domain knowledge, and an influence vector that assigns a value to each feature to fine-tune the results of the PH. We then prove the stability theorem of the proposed method, which states that adjusting the influence vectors grants stability to the PH calculations. The proposed approach enables the tailored analysis of a time series based on the graph representation methodology, which makes it applicable to real-world domains. We consider two examples to verify the proposed method's advantages: anomaly detection of stock data and topological analysis of music data.
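The baseline pipeline the abstract builds on (time series, then weighted graph, then distance matrix, then PH) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' construction: the edge-weight rule (inverse transition frequency), the function names, and the toy series are all hypothetical, only 0-dimensional persistence is computed, and the featured extension with influence vectors is not modeled because its formulas are not reproduced in this summary.

```python
from itertools import combinations

def transition_graph(series):
    """Nodes are the distinct values; an edge joins values that occur consecutively.
    ASSUMPTION: edge weight = 1 / (number of consecutive transitions)."""
    nodes = sorted(set(series))
    index = {v: i for i, v in enumerate(nodes)}
    counts = {}
    for a, b in zip(series, series[1:]):
        if a != b:
            key = (min(index[a], index[b]), max(index[a], index[b]))
            counts[key] = counts.get(key, 0) + 1
    weights = {edge: 1.0 / c for edge, c in counts.items()}
    return nodes, weights

def shortest_path_matrix(n, weights):
    """All-pairs shortest-path distances (Floyd-Warshall) on the weighted graph."""
    inf = float("inf")
    D = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for (i, j), w in weights.items():
        D[i][j] = D[j][i] = min(D[i][j], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D

def h0_barcode(D):
    """0-dimensional persistence of the Rips filtration on D.
    Every component is born at 0 and dies at the scale where it merges
    into another component (union-find over edges sorted by length)."""
    n = len(D)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    bars = []
    for d, i, j in sorted((D[i][j], i, j) for i, j in combinations(range(n), 2)):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            bars.append((0.0, d))
    bars.append((0.0, float("inf")))  # one component never dies
    return bars

# Toy usage with a hypothetical series of three distinct values.
series = [1, 2, 3, 2, 1, 2, 3, 1]
nodes, weights = transition_graph(series)
D = shortest_path_matrix(len(nodes), weights)
bars = h0_barcode(D)
```

In this toy case the rarely used direct edge between values 1 and 3 is longer than the detour through 2, so the shortest-path distance takes the detour. Higher-dimensional features would need a full PH library rather than this union-find shortcut.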

Paper Structure (19 sections, 7 theorems, 42 equations, 12 figures)

This paper contains 19 sections, 7 theorems, 42 equations, 12 figures.

Key Result

Proposition 3.1

Let $\widehat{d}$ be the distance defined in Definition 3.4. Then $\widehat{d}$ is a metric.
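To make concrete what the proposition asserts, the sketch below checks the metric axioms (zero self-distance, positivity, symmetry, triangle inequality) numerically on a finite distance matrix. The definition of $\widehat{d}$ is not reproduced in this summary, so the function name `is_metric` and both example matrices are hypothetical. A numerical check on one matrix illustrates the claim and does not replace the proof.

```python
def is_metric(D, tol=1e-12):
    """Return True if the square matrix D satisfies the metric axioms up to tol."""
    n = len(D)
    for i in range(n):
        if abs(D[i][i]) > tol:                   # d(x, x) = 0
            return False
        for j in range(n):
            if i != j and D[i][j] <= tol:        # d(x, y) > 0 for x != y
                return False
            if abs(D[i][j] - D[j][i]) > tol:     # symmetry
                return False
            for k in range(n):
                if D[i][k] > D[i][j] + D[j][k] + tol:  # triangle inequality
                    return False
    return True

# Hypothetical matrices: the first is a path metric on three points,
# the second violates the triangle inequality (5 > 1 + 1).
good = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
bad = [[0, 1, 5], [1, 0, 1], [5, 1, 0]]
good_ok = is_metric(good)
bad_ok = is_metric(bad)
```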

Figures (12)

  • Figure 1: A time series of temperature with the anomaly region shaded.
  • Figure 2: Weighted graph $G=(V,E,W_E)$ of the time series $T$ in Figure 1 (left) and the corresponding distance matrix (right).
  • Figure 3: Rips filtration (top) and its persistence barcode (bottom).
  • Figure 4: Schematic illustrations of the weighted graph $\widehat{G}^g=(V, E, \widehat{W}_V, \widehat{W}_E)$ for various influence vectors $g$. (i) represents the case where $\widehat{W}_V = 0$ and $\widehat{W}_E = W_E$, (ii) shows the changes when only the edge weight is varied, and (iii) illustrates the changes in the vertex weights from those in (ii).
  • Figure 5: A time series $\widehat{T}$ with features added from the original time series $T$ shown in Figure 1.
  • ...and 7 more figures

Theorems & Definitions (16)

  • Definition 2.1: Distance
  • Definition 3.1: Feature set
  • Definition 3.2: Influence vectors
  • Definition 3.3: Featured time series
  • Definition 3.4: Distance
  • Proposition 3.1
  • proof
  • Corollary 3.1
  • Proposition 3.2
  • proof
  • ...and 6 more