TableMaster: A Recipe to Advance Table Understanding with Language Models
Lang Cao, Hanbing Liu
TL;DR
TableMaster introduces a unified recipe for table understanding with language models (LMs), addressing four core challenges: difficulty locating target data, deficient table semantics, numerical inaccuracy in textual reasoning, and semantic inflexibility in symbolic reasoning. It combines table structure understanding (table-of-focus), table content understanding (verbalization and reconstruction), and adaptive reasoning (textual and text-guided symbolic) to enable robust, scalable question answering over tabular data. Across WikiTQ, TabFact, and FeTaQA, TableMaster achieves state-of-the-art results with consistent gains across backbone models, and ablations show that structure understanding, content understanding, and especially textual reasoning are each critical. The framework is designed to apply broadly to web tables, spreadsheets, and databases, offering a practical, efficient approach to real-world table reasoning with LMs.
Abstract
Tables serve as a fundamental format for representing structured relational data. While current language models (LMs) excel at many text-based tasks, they still struggle with table understanding because of the complex characteristics of tabular data, such as its structured nature. In this paper, we aim to enhance LMs for improved table understanding. We identify four key challenges: 1) difficulty in locating target data, 2) deficiency in table semantics, 3) numerical inaccuracies in textual reasoning, and 4) semantic inflexibility in symbolic reasoning. To address these issues, we propose TableMaster, a recipe and comprehensive framework that integrates multiple solutions to overcome these obstacles. TableMaster first extracts relevant table content and verbalizes it with enriched semantic context. Additionally, we introduce adaptive reasoning, a flexible approach that dynamically switches between textual and symbolic reasoning, tailoring the reasoning process to each query. Extensive analyses and experiments support our findings and demonstrate the effectiveness of TableMaster. On the WikiTQ dataset, TableMaster achieves an accuracy of 78.13% using GPT-4o-mini, surpassing existing baselines.
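
To make the three stages named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. TableMaster performs these steps with an LM; here each step is replaced by a toy heuristic so the example runs on its own. The function names (`table_of_focus`, `verbalize`, `choose_reasoning`), the `NUMERIC_CUES` keyword list, and the header-matching rule are all hypothetical choices made for illustration.

```python
from typing import Dict, List

# A table is a list of rows; each row maps a column header to a cell value.
Table = List[Dict[str, str]]

# Hypothetical cue words suggesting that a question needs arithmetic.
# This list is an assumption for illustration, not taken from the paper.
NUMERIC_CUES = {"sum", "total", "average", "difference", "how many", "most", "least"}


def table_of_focus(table: Table, question: str) -> Table:
    """Keep only the columns whose header appears in the question.

    Stand-in for locating target data; falls back to the full table
    when no header matches.
    """
    if not table:
        return table
    q = question.lower()
    keep = [col for col in table[0] if col.lower() in q]
    if not keep:
        return table
    return [{col: row[col] for col in keep} for row in table]


def verbalize(table: Table) -> str:
    """Turn each row into a plain sentence to add textual context."""
    return " ".join(
        "; ".join(f"{col} is {val}" for col, val in row.items()) + "."
        for row in table
    )


def choose_reasoning(question: str) -> str:
    """Route to 'symbolic' when the question looks numerical, else 'textual'.

    Uses naive substring matching against NUMERIC_CUES.
    """
    q = question.lower()
    return "symbolic" if any(cue in q for cue in NUMERIC_CUES) else "textual"


if __name__ == "__main__":
    table = [
        {"Team": "Ajax", "Wins": "12", "City": "Amsterdam"},
        {"Team": "PSV", "Wins": "9", "City": "Eindhoven"},
    ]
    question = "What is the total wins per team?"
    focus = table_of_focus(table, question)
    print(verbalize(focus))
    print(choose_reasoning(question))
```

In this sketch the routing decision is made once per question, which mirrors the idea of tailoring the reasoning process to each query. A real system would presumably let the LM make the selection and routing decisions, and would hand "symbolic" questions to generated code rather than stopping at a label.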
