Thucy: An LLM-based Multi-Agent System for Claim Verification across Relational Databases

Michael Theologitis, Dan Suciu

TL;DR

Thucy verifies natural-language (NL) claims against structured data, working across databases and across tables through a decoupled multi-agent architecture. A Verifier coordinates Data, Schema, and SQL experts, all connected to relational data via MCP tools, and returns each verdict together with the exact SQL queries that support it. On TabFact, Thucy achieves 94.3% accuracy, 5.6 percentage points above the previous state of the art (88.7%), while remaining agnostic to the underlying data sources and able to handle multiple databases. The end-to-end SQL traceability makes the approach transparent and practically useful for journalism, policy analysis, and data-driven fact-checking.

Abstract

In today's age, it is becoming increasingly difficult to decipher truth from lies. Every day, politicians, media outlets, and public figures make conflicting claims, often about topics that can, in principle, be verified against structured data. For instance, statements about crime rates, economic growth, or healthcare can all be verified against official public records and structured datasets. Building a system that can automatically do that would have sounded like science fiction just a few years ago. Yet, with the extraordinary progress in LLMs and agentic AI, this is now within reach. Still, there remains a striking gap between what is technically possible and what is being demonstrated by recent work. Most existing verification systems operate only on small, single-table databases (typically a few hundred rows) that conveniently fit within an LLM's context window. In this paper we report our progress on Thucy, the first cross-database, cross-table multi-agent claim verification system that also provides concrete evidence for each verification verdict. Thucy remains completely agnostic to the underlying data sources before deployment and must therefore autonomously discover, inspect, and reason over all available relational databases to verify claims. Importantly, Thucy also reports the exact SQL queries that support its verdict (whether the claim is accurate or not), offering full transparency to expert users familiar with SQL. When evaluated on the TabFact dataset, the standard benchmark for fact verification over structured data, Thucy surpasses the previous state of the art by 5.6 percentage points in accuracy (94.3% vs. 88.7%).

Paper Structure

This paper contains 22 sections, 5 figures, 3 tables.

Figures (5)

  • Figure 1: The architecture of Thucy, a multi-agent system led by the Verifier. Its job is to verify NL claims grounded in relational databases and report the corresponding SQL evidence. The Verifier coordinates three expert agents: the Data Expert, which summarizes available data sources; the Schema Expert, which answers schema-related questions; and the SQL Expert, which writes and executes SQL queries to obtain verifiable answers. The data layer follows a plug-and-play design and can include any number of relational databases, each potentially containing many tables; PostgreSQL, MySQL, SQL Server, and Oracle are shown here only as examples. Thucy remains fully agnostic to the underlying data sources. The agents must therefore operate in an open-ended environment, discovering and reasoning about available data as they encounter it. The experts interact with these relational databases through specialized tools managed via Google's MCP Toolbox. Adding or removing a database is straightforward: it simply involves adding or removing the corresponding tool from the toolbox.
  • Figure 2: A YAML fragment showing the configuration of database tools; schema-related tools are omitted for brevity.
  • Figure 3: Example YAML configuration of toolsets.
  • Figure 4: Inputs and outputs of the three expert agents. The Data Expert is invoked without input and returns a concise high-level report of what the connected databases appear to contain. The Schema Expert expects a schema-related question along with a short hint about where to look (for example, "NYC database"), and returns a precise answer to that question. Lastly, the SQL Expert expects a question about the data along with the schema details relevant to that question. It typically runs a few SQL queries against the databases, as the agent sees fit, and then returns a clear answer together with the concrete SQL commands that verify it.
  • Figure 5: SQL query and results produced by Thucy when verifying the City Attorney's claim. The query groups crimes by year and category and counts total incidents.
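
The expert contracts in Figure 4 and the kind of evidence query in Figure 5 can be illustrated with a minimal Python sketch. It is not the authors' implementation: in Thucy the agents are LLM-driven and reach the databases through MCP Toolbox tools, whereas here each expert is a plain function over an in-memory SQLite database and the Verifier's decisions are hard-coded. The table `crimes`, its columns, the sample rows, the claim text, and all function names are hypothetical.

```python
from __future__ import annotations

import sqlite3
from dataclasses import dataclass


@dataclass
class Verdict:
    claim: str
    supported: bool
    sql_evidence: list[str]  # exact queries backing the verdict


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny, made-up crime table (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE crimes (year INTEGER, category TEXT)")
    rows = [(2022, "burglary")] * 3 + [(2023, "burglary")] * 2 + [(2023, "theft")] * 4
    conn.executemany("INSERT INTO crimes VALUES (?, ?)", rows)
    return conn


def data_expert(dbs: dict[str, sqlite3.Connection]) -> dict[str, list[str]]:
    """Takes no question; reports which tables each connected database holds."""
    query = "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    return {name: [r[0] for r in conn.execute(query)] for name, conn in dbs.items()}


def schema_expert(dbs: dict[str, sqlite3.Connection], db_hint: str, table: str) -> list[str]:
    """Answers one fixed schema question (column names), given a hint on where to look."""
    return [col[1] for col in dbs[db_hint].execute(f"PRAGMA table_info({table})")]


def sql_expert(dbs: dict[str, sqlite3.Connection], db_name: str, sql: str):
    """Runs a query and returns the rows together with the SQL that produced them."""
    return dbs[db_name].execute(sql).fetchall(), sql


def verify_burglary_drop(dbs: dict[str, sqlite3.Connection]) -> Verdict:
    """Stand-in for the Verifier: discover data, check the schema, then query."""
    claim = "Burglaries decreased from 2022 to 2023."  # hypothetical claim
    overview = data_expert(dbs)
    db_name = next(n for n, tables in overview.items() if "crimes" in tables)
    columns = schema_expert(dbs, db_name, "crimes")
    if not {"year", "category"} <= set(columns):
        raise ValueError("expected columns not found")
    sql = (
        "SELECT year, category, COUNT(*) AS incidents FROM crimes "
        "GROUP BY year, category ORDER BY year, category"
    )
    rows, evidence = sql_expert(dbs, db_name, sql)
    counts = {(year, category): n for year, category, n in rows}
    supported = counts[(2023, "burglary")] < counts[(2022, "burglary")]
    return Verdict(claim, supported, [evidence])


if __name__ == "__main__":
    verdict = verify_burglary_drop({"city": make_demo_db()})
    print(verdict.supported)
    for query in verdict.sql_evidence:
        print(query)
```

The point of the sketch is the shape of the result: the verdict always travels with the SQL that produced it, so a reader who knows SQL can rerun the query and check the conclusion independently.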