SQuARE: Structured Query & Adaptive Retrieval Engine For Tabular Formats
Chinmay Gondhalekar, Urjitkumar Patel, Fang-Chun Yeh
TL;DR
SQuARE addresses the challenge of question answering over real spreadsheets by introducing an adaptive retrieval framework that routes queries between structure-preserving semantic chunks for complex, multi-header sheets and constrained SQL over an inferred relational view for flat tables. A sheet-level complexity score guides routing, while a lightweight agent enables confidence-based fallback and evidence fusion, ensuring verifiable cell/row-level provenance. Across complex balance sheets, a merged World Bank workbook, and flat tables, SQuARE outperforms static baselines and tool-free ChatGPT-4o in end-to-end accuracy and retrieval fidelity, with predictable latency on modest hardware. This work demonstrates that decoupling retrieval strategy from model choice and maintaining structural fidelity are practical, scalable steps toward robust tabular question answering and compatibility with emerging tabular foundation models.
Abstract
Accurate question answering over real spreadsheets remains difficult: multi-row headers, merged cells, and unit annotations disrupt naive chunking, while rigid SQL views fail on files that lack consistent schemas. We present SQuARE, a hybrid retrieval framework with sheet-level, complexity-aware routing. It computes a continuous score based on header depth and merge density, then routes queries either through structure-preserving chunk retrieval or through SQL over an automatically constructed relational representation. A lightweight agent supervises retrieval and, when confidence is low, refines the results or combines them across both paths. This design preserves header hierarchies, time labels, and units, so returned values are faithful to the original cells and straightforward to verify. Evaluated on multi-header corporate balance sheets, a heavily merged World Bank workbook, and diverse public datasets, SQuARE consistently surpasses single-strategy baselines and ChatGPT-4o on both retrieval precision and end-to-end answer accuracy while keeping latency predictable. By decoupling retrieval from model choice, the system is compatible with emerging tabular foundation models and offers a practical bridge toward more robust table understanding.
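The routing idea described in the abstract can be illustrated with a minimal sketch. The abstract states only that the score depends on header depth and merge density; the exact formula is not given here, so the weighted sum, the weights, the `max_depth` normalizer, the thresholds, and all names below (`SheetStats`, `complexity_score`, `route`, `paths_to_query`) are assumptions made purely for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class SheetStats:
    """Hypothetical per-sheet summary used to score structural complexity."""
    header_depth: int   # number of stacked header rows
    merged_cells: int   # cells covered by merged ranges
    total_cells: int    # non-empty cells in the sheet


def complexity_score(stats: SheetStats, max_depth: int = 4,
                     w_depth: float = 0.5, w_merge: float = 0.5) -> float:
    """Continuous score in [0, 1]; the weighted sum is an assumed form."""
    # A single header row contributes nothing; deeper headers saturate at max_depth.
    depth_term = min(max(stats.header_depth - 1, 0) / (max_depth - 1), 1.0)
    # Fraction of cells that sit inside merged ranges.
    density = stats.merged_cells / stats.total_cells if stats.total_cells else 0.0
    merge_term = min(density, 1.0)
    return w_depth * depth_term + w_merge * merge_term


def route(stats: SheetStats, threshold: float = 0.3) -> str:
    """Complex sheets go to chunk retrieval, flat sheets to SQL (assumed threshold)."""
    return "chunk" if complexity_score(stats) >= threshold else "sql"


def paths_to_query(primary: str, confidence: float,
                   min_confidence: float = 0.6) -> list[str]:
    """Confidence-based fallback: add the other path when confidence is low."""
    other = "sql" if primary == "chunk" else "chunk"
    return [primary] if confidence >= min_confidence else [primary, other]


flat_table = SheetStats(header_depth=1, merged_cells=0, total_cells=100)
balance_sheet = SheetStats(header_depth=3, merged_cells=40, total_cells=200)
print(route(flat_table), route(balance_sheet))
```

Under these assumed defaults, a single-header table with no merged cells scores 0 and is sent to SQL, while a three-level-header sheet with 20% merged cells crosses the threshold and is sent to chunk retrieval. When the primary path reports low confidence, both paths are queried so their evidence can be combined.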
