Text-to-TrajVis: Enabling Trajectory Data Visualizations from Natural Language Questions
Tian Bai, Huiyan Ying, Kailong Suo, Junqiu Wei, Tao Fan, Yuanfeng Song
TL;DR
The paper defines Text-to-TrajVis, a new NL2VIS task that translates natural language questions about trajectory data into visualizations via a new Trajectory Visualization Language (TVL). It introduces TrajVL, the first large-scale benchmark for this task, built through a seed-and-augment TVL generation process and LLM-assisted natural language question (NLQ) labeling, yielding 6,988 TVLs and 18,140 (question, TVL) pairs. Systematic evaluations of GPT, Qwen, Llama, and other LLMs, including retrieval-augmented generation (RAG) setups, show that the task is feasible but remains challenging, particularly for complex spatio-temporal queries; RAG improves performance, but gaps persist. The work provides a concrete dataset, a formal TVL-to-SQL translation flow, and baseline results to spur further research in trajectory-aware visual analytics and NL-based visualization interfaces. Overall, TrajVL supports end-to-end NL-to-visualization workflows for trajectory data and highlights robust spatio-temporal interpretation by LLMs as a key research direction.
Abstract
This paper introduces the Text-to-TrajVis task, which aims to transform natural language questions into trajectory data visualizations, facilitating the development of natural language interfaces for trajectory visualization systems. As this is a novel task, there is currently no relevant dataset available in the community. To address this gap, we first devise a new visualization language called Trajectory Visualization Language (TVL) to facilitate querying trajectory data and generating visualizations. Building on this foundation, we further propose a dataset construction method that integrates Large Language Models (LLMs) with human efforts to create high-quality data. Specifically, we first generate TVLs using a comprehensive and systematic process, and then label each TVL with corresponding natural language questions using LLMs. This process results in the first large-scale Text-to-TrajVis dataset, named TrajVL, which contains 18,140 (question, TVL) pairs. Based on this dataset, we systematically evaluate the performance of multiple LLMs (GPT, Qwen, Llama, etc.) on this task. The experimental results demonstrate that this task is both feasible and highly challenging, and that it merits further exploration within the research community.
