Evidence-Guided Schema Normalization for Temporal Tabular Reasoning
Ashish Thanga, Vibhu Dixit, Abhilash Shankarampeta, Vivek Gupta
TL;DR
Temporal reasoning over evolving semi-structured tables is hard for LLMs. The authors recast the problem as text-to-SQL: they generate a normalized 3NF schema from infobox timelines, populate a database, and then generate SQL queries guided by that schema. They show that schema quality can outweigh model capacity, reaching 80.39 EM on TransientTables with a Gemini-based configuration and demonstrating strong cross-model portability. The work highlights the value of hybrid approaches that pair LLMs with symbolic components, where principled schema design and executable queries enable more robust temporal QA than reasoning directly over raw JSON tables.
Abstract
Temporal reasoning over evolving semi-structured tables poses a challenge to current QA systems. We propose a SQL-based approach that involves (1) generating a 3NF schema from Wikipedia infoboxes, (2) generating SQL queries, and (3) executing those queries. Our central finding challenges model scaling assumptions: the quality of schema design has a greater impact on QA precision than model capacity. We establish three evidence-based principles: normalization that preserves context, semantic naming that reduces ambiguity, and consistent temporal anchoring. Our best configuration (Gemini 2.5 Flash for schema generation + Gemini 2.0 Flash for query generation) achieves 80.39 EM, a 16.8% improvement over the baseline (68.89 EM).
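To make the three-step pipeline concrete, the following is a minimal sketch using Python's built-in sqlite3 module. It is not the authors' implementation: the schema, the table and column names (entity, snapshot, career_stats), the helper functions, and the toy timeline for a fictional "Player A" are all assumptions chosen for illustration. In the paper, an LLM generates both the schema and the SQL; here both are written by hand to show what a normalized layout with descriptive names and a single date column per snapshot could look like.

```python
import sqlite3

# Hypothetical 3NF-style schema for one infobox timeline.
# All table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE entity (
    entity_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE snapshot (
    snapshot_id   INTEGER PRIMARY KEY,
    entity_id     INTEGER NOT NULL REFERENCES entity(entity_id),
    snapshot_date TEXT NOT NULL,  -- ISO 8601 date, the single temporal anchor
    UNIQUE (entity_id, snapshot_date)
);
CREATE TABLE career_stats (
    snapshot_id  INTEGER PRIMARY KEY REFERENCES snapshot(snapshot_id),
    current_club TEXT NOT NULL,
    total_goals  INTEGER NOT NULL
);
"""

# Toy timeline of infobox versions: (entity, snapshot date, club, goals).
TIMELINE = [
    ("Player A", "2014-06-01", "Club X", 10),
    ("Player A", "2016-06-01", "Club Y", 25),
    ("Player A", "2018-06-01", "Club Y", 40),
]

# Stand-in for an LLM-generated query: the value of an attribute
# as of a given date is taken from the latest snapshot on or before it.
CLUB_AS_OF_QUERY = """
SELECT cs.current_club
FROM career_stats AS cs
JOIN snapshot AS s ON s.snapshot_id = cs.snapshot_id
JOIN entity   AS e ON e.entity_id   = s.entity_id
WHERE e.name = ? AND s.snapshot_date <= ?
ORDER BY s.snapshot_date DESC
LIMIT 1
"""


def build_database(timeline):
    """Create an in-memory database and load every infobox snapshot."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    for name, date, club, goals in timeline:
        conn.execute("INSERT OR IGNORE INTO entity (name) VALUES (?)", (name,))
        (entity_id,) = conn.execute(
            "SELECT entity_id FROM entity WHERE name = ?", (name,)
        ).fetchone()
        cur = conn.execute(
            "INSERT INTO snapshot (entity_id, snapshot_date) VALUES (?, ?)",
            (entity_id, date),
        )
        conn.execute(
            "INSERT INTO career_stats (snapshot_id, current_club, total_goals) "
            "VALUES (?, ?, ?)",
            (cur.lastrowid, club, goals),
        )
    return conn


def club_as_of(conn, name, date):
    """Execute the temporal query; return None if no snapshot exists yet."""
    row = conn.execute(CLUB_AS_OF_QUERY, (name, date)).fetchone()
    return row[0] if row else None


conn = build_database(TIMELINE)
print(club_as_of(conn, "Player A", "2015-01-01"))
```

Storing every snapshot date in one ISO 8601 column lets plain string comparison order the timeline, which is one possible reading of the "consistent temporal anchoring" principle; the paper may realize it differently.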
