Discovering Leitmotifs in Multidimensional Time Series

Patrick Schäfer, Ulf Leser

TL;DR

An experimental evaluation on a novel, ground-truth-annotated benchmark of 14 distinct real-life data sets shows that LAMA, compared to four state-of-the-art baselines, achieves superior performance in detecting meaningful patterns without increased computational complexity.

Abstract

A leitmotif is a recurring theme in literature, movies or music that carries symbolic significance for the piece in which it appears. When this piece can be represented as a multi-dimensional time series (MDTS), such as acoustic or visual observations, finding a leitmotif is equivalent to the pattern discovery problem, an unsupervised and complex problem in time series analytics. Compared to the univariate case, it carries additional complexity because patterns typically occur not in all dimensions but only in a few, which are unknown in advance and must be detected by the method itself. In this paper, we present LAMA, a novel, efficient and highly effective leitmotif discovery algorithm for MDTS. LAMA rests on two core principles: (a) a leitmotif manifests only in a yet unknown number of sub-dimensions, neither too few nor too many, and (b) the set of sub-dimensions is not independent of the best pattern found therein, so both problems must be approached jointly. In contrast to most previous methods, LAMA tackles both problems jointly instead of first selecting dimensions (or leitmotifs) independently and then finding the best leitmotifs (or dimensions). Our experimental evaluation on a novel, ground-truth-annotated benchmark of 14 distinct real-life data sets shows that LAMA, compared to four state-of-the-art baselines, achieves superior performance in detecting meaningful patterns without increased computational complexity.

Paper Structure

This paper contains 26 sections, 11 equations, 14 figures, 2 tables, 2 algorithms.

Figures (14)

  • Figure 1: The Shire theme played by the Lord of the Rings symphonic orchestra. The suite opens and ends with the Hobbit leitmotif, which is played on a solo tin whistle. Position of leitmotif found by LAMA (brown) compared to the ground truth (gray/bottom) and four competitor approaches (orange, green, red and purple). (Image best viewed in color)
  • Figure 2: Depicted are three sets (blue, red, orange), each centered around a query (green point) and its $2$-NN. Leitmotif discovery involves two steps: (a) a $2$-NN search around each query subsequence, and (b) determining the extent of each set (red arrow), i.e. $d_1, d_2, d_3$. Finally, the top leitmotif, the one with the smallest extent $d_2$, is returned.
  • Figure 3: Dimensionality reduction workflow used in LAMA. Top: a query sequence is selected. (a) For each query sequence, a k-NN search is performed along each dimension. (b) The dimensions are sorted by the distance of the k-th NN. (c) The first $f$ dimensions are kept for this sequence.
  • Figure 4: The figure shows three boxing routines A, B, C (each from left to right), which are unintuitively considered similar when independently optimizing/selecting dimensions between pairs (compare Eq. \ref{eq:independent}). While A and C show a punching motion, in B the actor simply turns his body around his own axis making small steps. (A, B) and (B, C) show similarity in steps, while (A, C) is similar in the punching motion. (Image best viewed in color)
  • Figure 5: The EF maps the cardinality of a motif set to its extent. Elbow points represent large changes in the similarity of the found motif, indicative of a concept change.
  • ...and 9 more figures
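
The three steps named in the Figure 3 caption (per-dimension k-NN search, sorting dimensions by the distance of the k-th NN, keeping the first $f$) can be sketched for a single query as follows. This is a minimal illustration and not the LAMA implementation: the function names, the use of plain Euclidean distance, the rule that excludes windows overlapping the query, and the toy data are all assumptions made for the example.

```python
import math


def subsequences(series, m):
    """All sliding windows of length m over a one-dimensional series."""
    return [series[i:i + m] for i in range(len(series) - m + 1)]


def euclidean(a, b):
    """Plain Euclidean distance (assumption: the paper's distance may differ)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kth_nn_distance(windows, q, k, m):
    """Distance from window q to its k-th nearest neighbour.

    Windows that overlap the query (offset difference < m) are skipped,
    a simplifying assumption to avoid trivial matches.
    """
    dists = sorted(
        euclidean(windows[q], w)
        for j, w in enumerate(windows)
        if abs(j - q) >= m
    )
    return dists[k - 1]


def select_dimensions(mdts, q, m, k, f):
    """Hypothetical helper following steps (a)-(c) of the Figure 3 caption."""
    per_dim = []
    for d, series in enumerate(mdts):
        windows = subsequences(series, m)
        # (a) k-NN search along this dimension only
        per_dim.append((kth_nn_distance(windows, q, k, m), d))
    # (b) sort dimensions by the distance of the k-th NN
    per_dim.sort()
    # (c) keep the first f dimensions for this query
    return [d for _, d in per_dim[:f]]


# Toy data: dimensions 0 and 1 repeat a pattern at offsets 0, 8 and 16,
# dimension 2 never repeats.
dim0 = [0, 1, 2, 1, 5, 5, 5, 5, 0, 1, 2, 1, 7, 7, 7, 7, 0, 1, 2, 1, 9, 9, 9, 9]
dim1 = [3, 0, 3, 0, 8, 8, 8, 8, 3, 0, 3, 0, 6, 6, 6, 6, 3, 0, 3, 0, 4, 4, 4, 4]
dim2 = [0, 10, 3, 17, 25, 1, 40, 8, 33, 2, 50, 14,
        21, 60, 5, 45, 70, 11, 38, 80, 6, 55, 90, 19]
mdts = [dim0, dim1, dim2]

selected = select_dimensions(mdts, q=0, m=4, k=2, f=2)
print(selected)
```

On this toy input the query at offset 0 has two exact repeats in dimensions 0 and 1, so those two dimensions are ranked ahead of dimension 2. The sketch runs the search for one query only; repeating it for every query offset and comparing the resulting sets is left out.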

Theorems & Definitions (11)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Definition 6
  • Definition 7
  • Definition 8
  • Definition 9
  • Definition 10
  • ...and 1 more