Sonar-TS: Search-Then-Verify Natural Language Querying for Time Series Databases

Zhao Tan, Yiji Zhao, Shiyu Wang, Chang Xu, Yuxuan Liang, Xiping Liu, Shirui Pan, Ming Jin

TL;DR

This work presents the first systematic study of NLQ4TSDB, offering a general framework and evaluation standard to facilitate future research, and demonstrates that Sonar-TS effectively navigates complex temporal queries where traditional methods fail.

Abstract

Natural Language Querying for Time Series Databases (NLQ4TSDB) aims to help non-expert users retrieve meaningful events, intervals, and summaries from massive temporal records. However, existing Text-to-SQL methods are not designed for continuous morphological intents such as shapes or anomalies, while time series models struggle to handle ultra-long histories. To address these challenges, we propose Sonar-TS, a neuro-symbolic framework that tackles NLQ4TSDB via a Search-Then-Verify pipeline. Analogous to active sonar, it utilizes a feature index to ping candidate windows via SQL, followed by generated Python programs to lock on and verify candidates against raw signals. To enable effective evaluation, we introduce NLQTSBench, the first large-scale benchmark designed for NLQ over TSDB-scale histories. Our experiments highlight the unique challenges within this domain and demonstrate that Sonar-TS effectively navigates complex temporal queries where traditional methods fail. This work presents the first systematic study of NLQ4TSDB, offering a general framework and evaluation standard to facilitate future research.
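
The Search-Then-Verify idea described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the feature schema (per-window min, max, mean), the function names, the thresholds, and the synthetic series are all assumptions chosen for illustration. The query being answered is roughly "find narrow spikes": SQL filters a compact window-level index for candidates, and Python then checks each candidate against the raw samples.

```python
import sqlite3


def build_feature_index(conn, series, window):
    """Offline step (hypothetical schema): one summary row per non-overlapping window."""
    conn.execute(
        "CREATE TABLE features (start INTEGER, stop INTEGER, vmin REAL, vmax REAL, vmean REAL)"
    )
    for start in range(0, len(series), window):
        chunk = series[start:start + window]
        conn.execute(
            "INSERT INTO features VALUES (?, ?, ?, ?, ?)",
            (start, start + len(chunk), min(chunk), max(chunk), sum(chunk) / len(chunk)),
        )


def search_candidates(conn, min_range):
    """Search: a cheap SQL filter over the index; raw data is not touched here."""
    rows = conn.execute(
        "SELECT start, stop FROM features WHERE vmax - vmin >= ? ORDER BY start",
        (min_range,),
    )
    return rows.fetchall()


def verify_spike(series, start, stop, min_height, max_width):
    """Verify: inspect raw samples and accept only a single narrow excursion."""
    chunk = series[start:stop]
    baseline = sorted(chunk)[len(chunk) // 2]  # median as a robust baseline
    above = [i for i, v in enumerate(chunk) if v - baseline >= min_height]
    if not above:
        return None
    width = above[-1] - above[0] + 1
    # Reject excursions that are non-contiguous or too wide to count as a spike.
    if width != len(above) or width > max_width:
        return None
    return (start + above[0], start + above[-1])


# Synthetic series (assumed data): flat baseline, one narrow spike, one wide plateau.
series = [10.0] * 1000
for i in range(420, 423):
    series[i] = 18.0  # narrow spike, 3 samples wide
for i in range(720, 740):
    series[i] = 18.0  # plateau, 20 samples wide: passes the search, fails verification

conn = sqlite3.connect(":memory:")
build_feature_index(conn, series, window=50)

candidates = search_candidates(conn, min_range=5.0)
verified = []
for start, stop in candidates:
    hit = verify_spike(series, start, stop, min_height=5.0, max_width=5)
    if hit is not None:
        verified.append(hit)

print("candidates:", candidates)
print("verified:", verified)
```

In this toy setup the SQL stage returns two windows because both have a large value range, but only the window containing the narrow spike survives verification. The plateau is a false positive that the coarse index cannot rule out on its own, which is the situation the verification stage exists to handle.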

Paper Structure

This paper contains 54 sections, 10 equations, 6 figures, and 4 tables.

Figures (6)

  • Figure 1: Comparison of querying paradigms. While Text-to-SQL fails to express morphological intents and Time Series Models are limited by context length, Sonar-TS adopts a "Search-Then-Verify" pipeline: it uses SQL to search a symbolic index for candidates and Python to verify them on raw data.
  • Figure 2: The hierarchical taxonomy of tasks in NLQTSBench. The benchmark ranges from Level 1 (Basic Operations) which tests numerical filtering, to Level 2 (Pattern Recognition) for morphological grounding, Level 3 (Semantic Reasoning) for logical composition, and finally Level 4 (Insight Synthesis) for narrative reporting.
  • Figure 3: The overview of the Sonar-TS framework. The workflow is organized into three stages: (1) Offline Data Processing constructs compact multi-scale Feature Tables to serve as a queryable index; (2) Online Querying, where the Task Planner and Code Generator synthesize SQL for rapid candidate search and Python for exact verification, supported by a closed-loop Prompt Cold Start mechanism that evolves analysis insights; and (3) Post-processing translates execution artifacts into a user-friendly interface.
  • Figure 4: Case Study. Text-to-SQL (Left) lacks morphological expressivity, and TS Models (Right) fail the logical constraint. Sonar-TS (Middle) succeeds via Search-Then-Verify.
  • Figure 5: The human verification interface. Annotators inspect both global context and local details (where the injected signal in orange is overlaid on the raw data in blue) to validate the ground truth.
  • ...and 1 more figure