DRAMA: Unifying Data Retrieval and Analysis for Open-Domain Analytic Queries

Chuxuan Hu, Maxwell Yang, James Weiland, Yeji Lim, Suhas Palawala, Daniel Kang

TL;DR

Drama presents an end-to-end paradigm for open-domain analytic queries by unifying data collection, transformation, and analysis. Given a user query $Q$, the formulation has three stages: Data Collection, $\mathrm{collect}(Q) \rightarrow D$; Data Transformation, $\mathrm{transform}(Q, D) \rightarrow T$; and Data Analysis, $\mathrm{analyze}(Q, T) \rightarrow A$, where $D$ is the collected data, $T$ the transformed (structured) data, and $A$ the answer. It is instantiated in DramaBot, a multi-agent system that coordinates a data retriever and a data analyzer. DramaBench provides 200 real-world tasks (100 claim verification and 100 QA) that require up-to-date data collection and structured reasoning, so that data-grounded performance can be measured quantitatively. On DramaBench, DramaBot achieves an overall accuracy of $86.5\%$ at a cost of $\$0.05$ per task, reaching up to $6.9\times$ the accuracy of five state-of-the-art baselines at less than $1/6$ of the cost.
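The three-stage formulation above amounts to a composition of three functions. The following is a minimal sketch of that composition, not DramaBot's actual implementation: the `Pipeline` class, the `toy_*` functions, and the records they return are hypothetical placeholders invented purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Generic, List, Tuple, TypeVar

D = TypeVar("D")  # collected (raw) data
T = TypeVar("T")  # transformed (structured) data
A = TypeVar("A")  # final answer


@dataclass
class Pipeline(Generic[D, T, A]):
    """Hypothetical container for the three stages: collect, transform, analyze."""

    collect: Callable[[str], D]        # collect(Q) -> D
    transform: Callable[[str, D], T]   # transform(Q, D) -> T
    analyze: Callable[[str, T], A]     # analyze(Q, T) -> A

    def answer(self, query: str) -> A:
        # Chain the stages; the query is passed to every stage.
        raw = self.collect(query)
        structured = self.transform(query, raw)
        return self.analyze(query, structured)


def toy_collect(query: str) -> List[str]:
    # Stand-in for open-domain retrieval; returns made-up raw records.
    return ["alpha,3", "beta,7", "gamma,5"]


def toy_transform(query: str, raw: List[str]) -> List[Tuple[str, int]]:
    # Turn raw "name,value" strings into typed rows.
    rows = []
    for line in raw:
        name, value = line.split(",")
        rows.append((name, int(value)))
    return rows


def toy_analyze(query: str, table: List[Tuple[str, int]]) -> str:
    # Stand-in for analytic reasoning: name of the row with the largest value.
    return max(table, key=lambda row: row[1])[0]


pipeline = Pipeline(toy_collect, toy_transform, toy_analyze)
result = pipeline.answer("Which item has the largest value?")
print(result)
```

In this sketch each stage receives the query $Q$ alongside the previous stage's output, mirroring the signatures $\mathrm{transform}(Q, D)$ and $\mathrm{analyze}(Q, T)$. A real system would replace the toy functions with retrieval, structuring, and reasoning components.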

Abstract

Manually conducting real-world data analyses is labor-intensive and inefficient. Despite numerous attempts to automate data science workflows, none of the existing paradigms or systems fully demonstrate all three key capabilities required to support them effectively: (1) open-domain data collection, (2) structured data transformation, and (3) analytic reasoning. To overcome these limitations, we propose DRAMA, an end-to-end paradigm that answers users' analytic queries in natural language on large-scale open-domain data. DRAMA unifies data collection, transformation, and analysis as a single pipeline. To quantitatively evaluate system performance on tasks representative of DRAMA, we construct a benchmark, DRAMA-Bench, consisting of two categories of tasks: claim verification and question answering, each comprising 100 instances. These tasks are derived from real-world applications that have gained significant public attention and require the retrieval and analysis of open-domain data. We develop DRAMA-Bot, a multi-agent system designed following DRAMA. It comprises a data retriever that collects and transforms data by coordinating the execution of sub-agents, and a data analyzer that performs structured reasoning over the retrieved data. We evaluate DRAMA-Bot on DRAMA-Bench together with five state-of-the-art baseline agents. DRAMA-Bot achieves 86.5% task accuracy at a cost of $0.05, outperforming all baselines with up to 6.9 times the accuracy and less than 1/6 of the cost. DRAMA is publicly available at https://github.com/uiuc-kang-lab/drama.

Paper Structure

This paper contains 24 sections, 1 equation, 5 figures, 8 tables.

Figures (5)

  • Figure 1: Drama integrates the full data science pipeline, where (C1)–(C3) correspond to the essential capabilities underpinning each stage. Here, "Data Retrieval" refers to the collection and structuring of raw data from open domains.
  • Figure 2: Overview of the Drama paradigm. Here we present two examples: (left) user query as a question ($Q_1$), and (right) user query as a claim to be verified ($Q_2$).
  • Figure 3: Overview of each DramaBench task. Given a user query, the agent is tasked with collecting, structuring, and analyzing data from open domains to generate an answer.
  • Figure 4: Overview of DramaBot.
  • Figure 5: The overall accuracy (%) of different agents across time. Each labeled time point marks the start of a three-month period.