Retrieving Time-Series Differences Using Natural Language Queries

Kota Dohi, Tomoya Nishida, Harsh Purohit, Takashi Endo, Yohei Kawaguchi

TL;DR

The paper addresses retrieving pairs of time series whose differences match a description given in natural language. It defines six characteristics of differences, generates a large synthetic dataset paired with query texts, and proposes a contrastive learning framework that aligns the difference between a reference and a target time series with the query text in a shared embedding space. In experiments, the approach achieves an overall mAP of 0.994, demonstrating effective retrieval across all difference types. This lets non-expert users search for differences between time series, with strong implications for industrial monitoring and comparative analysis across sensors.

Abstract

Effectively searching time-series data is essential for system analysis; however, traditional methods often require domain expertise to define search criteria. Recent advancements have enabled natural language-based search, but these methods struggle to handle differences between time-series data. To address this limitation, we propose a natural language query-based approach for retrieving pairs of time-series data based on differences specified in the query. Specifically, we define six key characteristics of differences, construct a corresponding dataset, and develop a contrastive learning-based model to align differences between time-series data with query texts. Experimental results demonstrate that our model achieves an overall mAP score of 0.994 in retrieving time-series pairs.
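For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean average precision (mAP) is commonly computed for a retrieval task. It is not the authors' evaluation code: the function names and the toy rankings are hypothetical, and it assumes each query ranks every candidate pair, so that all relevant pairs appear somewhere in the ranked list.

```python
from typing import Sequence


def average_precision(ranked_relevance: Sequence[bool]) -> float:
    """Average precision for one query.

    ranked_relevance[i] is True when the item at rank i + 1 is relevant.
    AP is the mean of precision@k taken over the ranks k that hold a
    relevant item; it is defined as 0.0 when nothing relevant is retrieved.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-query average precision values."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Hypothetical toy example: two queries, each with a ranked candidate list.
toy_rankings = [
    [True, False],   # relevant pair ranked first  -> AP = 1.0
    [False, True],   # relevant pair ranked second -> AP = 0.5
]
toy_map = mean_average_precision(toy_rankings)
```

Under this definition, an mAP close to 1 means that, for almost every query, the relevant time-series pairs are ranked ahead of the irrelevant ones.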

Paper Structure

This paper contains 16 sections, 16 equations, 2 figures, 4 tables.

Figures (2)

  • Figure 1: Overview of the model architecture for aligning time-series data with query texts.
  • Figure 2: t-SNE plots of query text embeddings. Left: Embeddings generated by the pre-trained BART-Large-XSum model. Right: Embeddings generated by the fine-tuned BART-Large-XSum model.