Relation-driven Query of Multiple Time Series

Shuhan Liu, Yuan Tian, Zikun Deng, Weiwei Cui, Haidong Zhang, Di Weng, Yingcai Wu

TL;DR

This work tackles relation-driven querying over multiple time series by identifying a heterogeneous set of relations and designing RelaQ, a fuzzy-interactive system with a three-stage pipeline (preprocessing, query formulation, and processing). It combines a data-structure–driven graph-matching model with a matrix/time-based visualization to support complex relation queries and exploration, complemented by on-demand guidance. A formative study defines relation scope and user requirements, while case studies and a user study demonstrate practical effectiveness and usability in domains like EEG analysis and urban air pollution. The results suggest RelaQ provides intuitive, scalable retrieval of complex relation patterns and lays groundwork for integrating intelligent pattern mining in future work, with attention to scalability and learnability considerations.

Abstract

Querying time series based on their relations is a crucial part of multiple time series analysis. By retrieving and understanding time series relations, analysts can easily detect anomalies and validate hypotheses in complex time series datasets. However, current relation extraction approaches, including knowledge- and data-driven ones, tend to be laborious and do not support heterogeneous relations. Through a formative study with 11 experts, we identified six time series relations (correlation, causality, similarity, lag, arithmetic, and meta) and summarized three pain points in querying time series involving these relations. We propose RelaQ, an interactive system that supports time series queries via relation specifications. RelaQ allows users to intuitively specify heterogeneous relations when querying multiple time series, understand the query results through a scalable, multi-level visualization, and explore possible relations beyond the existing queries. RelaQ is evaluated with two use cases and a user study with 12 participants, showing promising effectiveness and usability.

Paper Structure (27 sections, 1 equation, 19 figures, 1 table)

Figures (19)

  • Figure 1: An example of a relation-driven query. An analyst observes a pair of time series (A and B) that is positively correlated globally and searches for fragments that are negatively correlated locally. The matching time fragments are highlighted in the result.
  • Figure 2: The user interface of RelaQ. It is composed of three main parts. In the (A) input panel, users can sketch trends and specify relations. In the (B) result panel, users can obtain an overview, compare results in the matrix view, and inspect details in the time view. The (C) guidance panel lists recommended timeboxes. The figure also shows part of the first case in Sec. \ref{sec:case1}: (D) an example query and (E-F) some patterns in its results.
  • Figure 3: Investigation responses from time series analysts. (A) Typical time series analysis tasks that participants work on; all nine types of tasks are covered. (B) The scale of time series data that participants usually deal with, given as the number of variables $\times$ the length of a single time series. (C) How often analysts analyze time series in a week, indicating that all participants are very familiar with time series data.
  • Figure 4: The overview of RelaQ's workflow. Users upload a multiple time series dataset and a configuration file to RelaQ, which preprocesses the data, builds various indexes, and allows users to specify queries. The system can also provide guidance by recommending query constraints.
  • Figure 5: A three-step data preprocessing method, illustrated with an example. The raw data contains multiple time series (SF, LA, ..., FR) and label descriptions (e.g., SF's State is CA). (A) First, each time series is compressed by averaging segments with Sampling Length = 4; the blue line shows the compressed data. (B) Second, relation indexes are computed. We compute the relation strength between each pair of time series and record all time series names in descending order of relation strength over the whole length. For example, SA is the city with the highest similarity strength with SF. (C) Third, trend indexes are computed. We transform the compressed data into symbolic sequences using SAX (alphabet size = 4). The sequence (ZXWWXY) is built into a trie (depth = 8/4 = 2, box length = 8, sampling length = 4). For example, the starting points of ZX contain [0], and the ratio is 0.2.
  • ...and 14 more figures
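
The compression and trend-index steps described in the Figure 5 caption can be sketched as follows. This is a minimal illustration, not RelaQ's implementation. The function names (`compress`, `sax_symbols`, `trend_index`) are hypothetical, the symbol order W < X < Y < Z is an assumption, and a flat dictionary of fixed-depth prefixes stands in for the trie. Reading the caption's "ratio" as the share of windows that start with a given prefix is also an assumption; it reproduces the 0.2 reported for ZX.

```python
from statistics import NormalDist, mean, pstdev


def compress(series, sampling_length=4):
    """Step A: replace each full segment of `sampling_length` points by its mean."""
    last_start = len(series) - sampling_length
    return [mean(series[i:i + sampling_length])
            for i in range(0, last_start + 1, sampling_length)]


def sax_symbols(values, alphabet="WXYZ"):
    """Step C (part 1): z-normalise, then bin by equiprobable Gaussian breakpoints.

    The alphabet order (W lowest, Z highest) is an assumption.
    """
    mu, sigma = mean(values), pstdev(values)
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    symbols = []
    for v in values:
        z = (v - mu) / sigma if sigma > 0 else 0.0
        # The symbol index is the number of breakpoints lying below z.
        symbols.append(alphabet[sum(z > b for b in breakpoints)])
    return "".join(symbols)


def trend_index(symbols, depth=2):
    """Step C (part 2): map every length-`depth` prefix to its starting points."""
    index = {}
    for start in range(len(symbols) - depth + 1):
        index.setdefault(symbols[start:start + depth], []).append(start)
    return index


# The caption's example sequence, with depth = box length / sampling length = 8 / 4.
index = trend_index("ZXWWXY", depth=2)
total_windows = sum(len(starts) for starts in index.values())
zx_ratio = len(index["ZX"]) / total_windows
```

With the caption's sequence ZXWWXY there are five length-2 windows, and ZX starts only at position 0, so its share is 1/5 = 0.2. A real trie would additionally share prefixes across depths, which matters once the depth grows beyond 2.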