nvAgent: Automated Data Visualization from Natural Language via Collaborative Agent Workflow
Geliang Ouyang, Jingyao Chen, Zhihe Nie, Yi Gui, Yao Wan, Hongyu Zhang, Dongping Chen
TL;DR
This work tackles NL2Vis, the task of translating natural-language queries into accurate visualizations over multi-table databases. It introduces nvAgent, a collaborative agent workflow with three agents: a processor that preprocesses the database schema, a composer that plans VQL sketches via step-by-step reasoning, and a validator that checks the result by executing it. On the VisEval benchmark, nvAgent achieves state-of-the-art pass rates and quality scores, with notable gains in both single- and multi-table scenarios, and the paper provides extensive ablation and qualitative analyses to justify the design. The study also analyzes backbone effects, prompting strategies, and error sources (notably temporal data handling), and discusses limitations and avenues for future work, including open-source tooling and retrieval-augmented methods. Overall, nvAgent demonstrates robust, scalable NL2Vis performance across heterogeneous data sources and task complexities, offering a practical, end-to-end solution for automated data visualization from natural language.
Abstract
Natural Language to Visualization (NL2Vis) seeks to convert natural-language descriptions into visual representations of given tables, empowering users to derive insights from large-scale data. Recent advancements in Large Language Models (LLMs) show promise in automating code generation to transform tabular data into accessible visualizations. However, they often struggle with complex queries that require reasoning across multiple tables. To address this limitation, we propose a collaborative agent workflow, termed nvAgent, for NL2Vis. Specifically, nvAgent comprises three agents: a processor agent for database processing and context filtering, a composer agent for planning visualization generation, and a validator agent for code translation and output verification. Comprehensive evaluations on the new VisEval benchmark demonstrate that nvAgent consistently surpasses state-of-the-art baselines, achieving a 7.88% improvement in single-table and a 9.23% improvement in multi-table scenarios. Qualitative analyses further highlight that nvAgent maintains nearly a 20% performance margin over previous models, underscoring its capacity to produce high-quality visual representations from complex, heterogeneous data sources.
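The three-agent division of labor described above can be pictured with a minimal Python sketch. Everything in it is an illustrative assumption rather than the authors' implementation: the names `processor`, `composer`, `validator`, and `run_pipeline` are hypothetical, the keyword filter and the fixed plan stand in for LLM calls, and `execute` is a caller-supplied stand-in for actually running the generated code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Schema = Dict[str, List[str]]  # table name -> column names


@dataclass
class Context:
    query: str
    schema: Schema


def processor(query: str, schema: Schema) -> Context:
    """Toy context filter: keep tables whose name or columns appear in the query."""
    words = {w.strip(".,?!").lower() for w in query.split()}
    kept = {
        table: cols
        for table, cols in schema.items()
        if table.lower() in words or any(c.lower() in words for c in cols)
    }
    # Fall back to the full schema if the filter removed everything.
    return Context(query=query, schema=kept or schema)


def composer(ctx: Context) -> dict:
    """Placeholder for the planning step; a real system would call an LLM here."""
    return {"chart": "bar", "tables": sorted(ctx.schema), "query": ctx.query}


def validator(plan: dict, execute: Callable[[str], bool]) -> Tuple[str, bool]:
    """Translate the plan into (stub) code and verify it by executing it."""
    code = f"# chart={plan['chart']} tables={','.join(plan['tables'])}"
    return code, execute(code)


def run_pipeline(query: str, schema: Schema,
                 execute: Callable[[str], bool]) -> Tuple[str, bool]:
    ctx = processor(query, schema)    # database processing and context filtering
    plan = composer(ctx)              # visualization planning
    return validator(plan, execute)   # code translation and output verification


if __name__ == "__main__":
    demo_schema = {"sales": ["region", "amount"], "staff": ["name", "salary"]}
    print(run_pipeline("Show total amount by region", demo_schema,
                       lambda code: code.startswith("#")))
```

In this sketch the processor narrows a two-table schema to the single table the query mentions before any planning happens, which mirrors the stated motivation of filtering context for multi-table queries.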
