Talking with Tables for Better LLM Factual Data Interactions
Jio Oh, Geon Heo, Seungjun Oh, Hyunjin Kim, JinYeong Bak, Jindong Wang, Xing Xie, Steven Euijong Whang
TL;DR
The paper investigates how data structuring affects LLM factual data interactions, showing that tabular representations yield superior accuracy, robustness, and token efficiency compared with natural text, JSON, and knowledge graphs. It introduces a comprehensive evaluation framework that compares multiple formats across three interaction scenarios and six task types, with extensive ablations demonstrating the conditions under which tables help most. Mechanistic analyses reveal that tabular inputs steer model attention toward relevant attributes and schemas, which explains the performance gains, especially in sparse data settings. The findings suggest that Thinking with Tables is a practical, scalable approach for real-world data analytics with LLMs, and point to extensions toward partial structuring and integration with non-tabular data sources for broader applicability.
Abstract
Large Language Models (LLMs) often struggle with information retrieval and data manipulation requests that involve multiple conditions, which arise frequently in real-world scenarios. In this paper, we demonstrate that leveraging tabular structures in LLM interactions is more effective than utilizing other structures for handling prevalent requests that operate over factual data. Through comprehensive evaluations across various scenarios and request types, we show that providing tabular structures yields a 40.29% average performance gain along with better robustness and token efficiency. Through attention-value analysis, we discover that tables help LLMs better locate relevant information, explaining these improvements. Beyond tables and text, we evaluate whether (1) blending structuredness within text, such as providing templates or fixing the order of attributes, and (2) other representative structures, such as knowledge graphs and JSON, are helpful. We observe that utilizing tables offers the best balance between efficiency and effectiveness. The method remains robust to task complexity and adapts to unstructured sources through text-to-table conversion. Overall, we highlight the untapped potential of tabular representations for future LLM applications.
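To make the compared input formats concrete, the following is a minimal sketch of how the same factual records might be serialized as natural text, JSON, and a table before being placed in a prompt. The records, the function names (`to_text`, `to_json`, `to_table`, `build_prompt`), and the pipe-delimited table layout are all assumptions chosen for illustration; the abstract does not specify the serialization the authors use.

```python
import json

# Hypothetical records, invented purely for illustration (not from the paper).
records = [
    {"name": "Alice", "team": "Search", "year_joined": 2019},
    {"name": "Bob", "team": "Ads", "year_joined": 2021},
    {"name": "Carol", "team": "Search", "year_joined": 2020},
]


def to_text(rows):
    """Natural-text form: one free-form sentence per record."""
    return " ".join(
        f"{r['name']} works on the {r['team']} team and joined in {r['year_joined']}."
        for r in rows
    )


def to_json(rows):
    """JSON form: attribute names are repeated inside every record."""
    return json.dumps(rows)


def to_table(rows):
    """Pipe-delimited table: attribute names appear once, in the header row."""
    columns = list(rows[0].keys())
    header = "| " + " | ".join(columns) + " |"
    separator = "| " + " | ".join("---" for _ in columns) + " |"
    body = ["| " + " | ".join(str(r[c]) for c in columns) + " |" for r in rows]
    return "\n".join([header, separator, *body])


def build_prompt(context, question):
    """Assumed prompt layout: serialized data first, then the request."""
    return f"{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    question = "Who joined the Search team after 2019?"
    for label, serialize in [("text", to_text), ("json", to_json), ("table", to_table)]:
        context = serialize(records)
        # Character count is only a crude stand-in for token count.
        print(f"--- {label} ({len(context)} characters) ---")
        print(build_prompt(context, question))
        print()
```

In this sketch the table states each attribute name once in the header, whereas the JSON form repeats it for every record. That is one plausible source of the token-efficiency difference the abstract reports, though actual savings depend on the tokenizer and the data.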
