Retrieving Time-Series Differences Using Natural Language Queries
Kota Dohi, Tomoya Nishida, Harsh Purohit, Takashi Endo, Yohei Kawaguchi
TL;DR
The paper addresses the retrieval of time-series pairs whose differences match a description given in natural language. It defines six characteristics of differences, generates a synthetic dataset paired with query texts, and proposes a contrastive learning framework that aligns the difference between a reference and a target time-series with the query text in a shared embedding space. In the reported experiments, the approach achieves an overall mAP of 0.994 in retrieving time-series pairs. The work is aimed at letting users without domain expertise search for differences between time-series, which may be useful for industrial monitoring and comparative analysis across sensors.
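To make the idea of aligning pair differences with query texts in a shared embedding space more concrete, the following is a minimal sketch of a symmetric contrastive (InfoNCE-style) loss. It is an illustration under stated assumptions, not the authors' implementation: the summary only says the model is contrastive, so the exact loss form, the temperature value, and the names `pair_embs`, `text_embs`, and `symmetric_info_nce` are hypothetical. The embeddings are assumed to be precomputed, with `pair_embs[i]` and `text_embs[i]` forming a matched pair.

```python
import math


def l2_normalize(v):
    # Scale a vector to unit length (assumes a non-zero vector).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax(row):
    # Numerically stable log-softmax over one list of logits.
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def symmetric_info_nce(pair_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch of matched embeddings.

    pair_embs[i]: hypothetical embedding of the i-th (reference, target) pair.
    text_embs[i]: hypothetical embedding of the query describing that pair.
    The temperature of 0.07 is an assumed value, not one taken from the paper.
    """
    p = [l2_normalize(v) for v in pair_embs]
    t = [l2_normalize(v) for v in text_embs]
    n = len(p)
    # Cosine similarity between every pair embedding and every text embedding.
    logits = [
        [sum(a * b for a, b in zip(p[i], t[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Pair-to-text direction: each row should peak at its own index.
    loss_p2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-pair direction: each column should peak at its own index.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_t2p = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return 0.5 * (loss_p2t + loss_t2p)


# Matched embeddings give a loss near zero; mismatched ones give a large loss.
aligned = symmetric_info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = symmetric_info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

Minimizing a loss of this kind pulls each pair embedding toward its own query text and pushes it away from the other queries in the batch. That is the property retrieval relies on when candidate pairs are ranked by similarity to a query.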
Abstract
Effectively searching time-series data is essential for system analysis; however, traditional methods often require domain expertise to define search criteria. Recent advancements have enabled natural language-based search, but these methods struggle to handle differences between time-series data. To address this limitation, we propose a natural language query-based approach for retrieving pairs of time-series data based on differences specified in the query. Specifically, we define six key characteristics of differences, construct a corresponding dataset, and develop a contrastive learning-based model to align differences between time-series data with query texts. Experimental results demonstrate that our model achieves an overall mAP score of 0.994 in retrieving time-series pairs.
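For readers unfamiliar with the reported metric, the sketch below shows how mean average precision (mAP) is commonly computed from ranked retrieval results. It is a generic illustration, not the authors' evaluation code: the function names and the toy rankings are hypothetical, and it assumes every relevant item appears somewhere in each ranked list.

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance: list of booleans, ordered from the top-ranked result
    down, where True marks a relevant result.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision measured at the rank of each relevant result.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings):
    # Mean of the per-query average precision values.
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Toy example with two queries (made-up relevance labels).
toy_rankings = [[True, False, True], [False, True]]
toy_map = mean_average_precision(toy_rankings)
print(round(toy_map, 4))
```

In the toy example the first query scores (1/1 + 2/3) / 2 and the second scores 1/2, so the mean is 2/3. A score close to 1, such as the reported 0.994, means relevant pairs are ranked at or near the top for almost every query.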
