Thucy: An LLM-based Multi-Agent System for Claim Verification across Relational Databases
Michael Theologitis, Dan Suciu
TL;DR
Thucy addresses the challenge of verifying natural-language claims against structured data by enabling cross-database, cross-table verification through a decoupled multi-agent architecture. It employs a Verifier that coordinates Data, Schema, and SQL experts, all connected to relational data via MCP tools, and delivers each verdict together with the exact SQL queries that support it. On TabFact, Thucy achieves 94.3% accuracy, surpassing the prior state of the art (88.7%) while remaining agnostic to the underlying data sources and capable of handling multiple databases. The work demonstrates a scalable, transparent approach to claim verification with end-to-end SQL traceability, offering practical utility for journalism, policy analysis, and data-driven fact-checking.
Abstract
Today, it is becoming increasingly difficult to distinguish truth from lies. Every day, politicians, media outlets, and public figures make conflicting claims, often about topics that can, in principle, be verified against structured data. For instance, statements about crime rates, economic growth, or healthcare can all be checked against official public records and structured datasets. Building a system that does this automatically would have sounded like science fiction just a few years ago. Yet, with the extraordinary progress in LLMs and agentic AI, this is now within reach. Still, there remains a striking gap between what is technically possible and what recent work has demonstrated. Most existing verification systems operate only on small, single-table databases (typically a few hundred rows) that conveniently fit within an LLM's context window. In this paper we report our progress on Thucy, the first cross-database, cross-table multi-agent claim verification system that also provides concrete evidence for each verification verdict. Thucy remains completely agnostic to the underlying data sources before deployment and must therefore autonomously discover, inspect, and reason over all available relational databases to verify claims. Importantly, Thucy also reports the exact SQL queries that support its verdict (whether the claim is accurate or not), offering full transparency to expert users familiar with SQL. When evaluated on the TabFact dataset, the standard benchmark for fact verification over structured data, Thucy surpasses the previous state of the art by 5.6 percentage points in accuracy (94.3% vs. 88.7%).
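To make the idea of a verdict backed by SQL evidence concrete, the following is a minimal sketch and not Thucy's implementation. It assumes the claim has already been translated into a single scalar SQL query and an expected value, which in the actual system would be the job of the agents. The names `Verdict`, `check_scalar_claim`, and `build_demo_db`, and the toy `incidents` table, are hypothetical and invented for illustration.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Verdict:
    claim: str
    supported: bool
    evidence_sql: str  # the exact query whose result backs the verdict
    observed: object   # the value that query returned


def check_scalar_claim(conn, claim, sql, expected):
    """Run one scalar SQL query and compare its result to the claimed value.

    Hypothetical helper: it assumes the claim was already mapped to `sql`
    and `expected`; it does not do any natural-language reasoning.
    """
    row = conn.execute(sql).fetchone()
    observed = row[0] if row else None
    return Verdict(claim, observed == expected, sql, observed)


def build_demo_db():
    # Toy data, invented for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE incidents (city TEXT, year INTEGER, n_incidents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO incidents VALUES (?, ?, ?)",
        [
            ("Springfield", 2022, 120),
            ("Springfield", 2023, 95),
            ("Shelbyville", 2023, 140),
        ],
    )
    return conn


conn = build_demo_db()
sql = "SELECT SUM(n_incidents) FROM incidents WHERE year = 2023"
verdict = check_scalar_claim(conn, "There were 235 incidents in 2023.", sql, 235)
print(verdict.supported)
print(verdict.evidence_sql)
```

Returning the query string alongside the Boolean verdict is what lets a SQL-literate reader rerun the evidence independently, whether the claim turns out to be supported or refuted.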
