Talking with Tables for Better LLM Factual Data Interactions

Jio Oh, Geon Heo, Seungjun Oh, Hyunjin Kim, JinYeong Bak, Jindong Wang, Xing Xie, Steven Euijong Whang

TL;DR

The paper investigates how data structuring affects LLM interactions over factual data, showing that tabular representations yield higher accuracy, robustness, and token efficiency than natural text, JSON, and knowledge graphs. It introduces an evaluation framework that compares multiple formats across three interaction scenarios and six task types, with extensive ablations showing the conditions under which tables help most. Attention analyses indicate that tabular inputs steer model attention toward relevant attributes and schemas, which explains the performance gains, especially in sparse data settings. The findings suggest that Talking with Tables is a practical, scalable approach for real-world data analytics with LLMs, and they point to extensions toward partial structuring and integration with non-tabular data sources.

Abstract

Large Language Models (LLMs) often struggle with information retrieval and data manipulation requests that carry multiple conditions, which frequently arise in real-world scenarios. In this paper, we demonstrate that leveraging tabular structures in LLM interactions is more effective than using other structures for handling prevalent requests that operate over factual data. Through comprehensive evaluations across various scenarios and request types, we show that providing tabular structures yields a 40.29% average performance gain along with better robustness and token efficiency. Through attention-value analysis, we find that tables help LLMs better locate relevant information, which explains these improvements. Beyond tables and text, we evaluate whether (1) blending structuredness within text, such as providing templates or fixing the order of attributes, and (2) other representative structures, such as knowledge graphs and JSON, are helpful. We observe that tables offer the best balance between efficiency and effectiveness. The method remains robust to task complexity and adapts to unstructured sources through text-to-table conversion. Overall, we highlight the untapped potential of tabular representations for future LLM applications.
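To make the comparison of structures concrete, the following is a minimal sketch that serializes the same records as a table, as JSON, and as natural text, then compares their sizes. The records, field names, serialization formats, and helper functions are all hypothetical and are not taken from the paper; character count is used only as a crude stand-in for token count, and a real tokenizer would give different numbers.

```python
import json

# Hypothetical records; the field names and values are illustrative only.
RECORDS = [
    {"name": "Alice", "age": 34, "city": "Seoul"},
    {"name": "Bob", "age": 28, "city": "Busan"},
    {"name": "Carol", "age": 41, "city": "Daejeon"},
]


def to_table(records):
    """Serialize records as a pipe-delimited table with one header row."""
    columns = list(records[0].keys())
    lines = [" | ".join(columns)]
    for record in records:
        lines.append(" | ".join(str(record[col]) for col in columns))
    return "\n".join(lines)


def to_json(records):
    """Serialize records as a JSON array of objects."""
    return json.dumps(records)


def to_text(records, key="name"):
    """Serialize records as one sentence per (entity, attribute) pair."""
    sentences = []
    for record in records:
        for attribute, value in record.items():
            if attribute == key:
                continue
            sentences.append(f"The {attribute} of {record[key]} is {value}.")
    return " ".join(sentences)


def size_report(records):
    """Return character counts per format (a rough proxy, not real tokens)."""
    return {
        "table": len(to_table(records)),
        "json": len(to_json(records)),
        "text": len(to_text(records)),
    }


if __name__ == "__main__":
    print(to_table(RECORDS))
    print(size_report(RECORDS))
```

In this sketch the table states each attribute name once in the header, while the JSON and text forms repeat it for every record, which is one plausible reason a tabular prompt can be shorter for the same information.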

Paper Structure

This paper contains 54 sections, 4 equations, 9 figures, 16 tables.

Figures (9)

  • Figure 1: Comparison of different structures in terms of LLM performance and number of tokens. Although other formats include additional information about the context or the relationships between attributes, the Table format achieves the highest performance, indicating that it is both token-efficient and effective. Textual formats are divided into three structuring levels (green). Examples of each structure are in Tbl. \ref{tab: structuring level} and \ref{tab: semi-structure_representation}, and the extensive results are in Sec. \ref{sec: result}.
  • Figure 2: Evaluation Framework. We design the framework to evaluate the impact of structures on LLMs' performance and robustness for requests that operate over factual data, with analyses and ablation studies across various models that examine the generalizability of Talking with Tables. We compare LLM performance on different structural representations of the same information in the scenarios shown in Fig. \ref{fig:scenario}.
  • Figure 3: To evaluate the effectiveness of structuredness, we define three types of conversation scenarios: single-turn, multi-turn, and pre-instruction. A request $Req$ consists of a main request and additional conditions (underlined). White and black speech bubbles denote user requests and model responses, respectively.
  • Figure 4: Variance of each model's performance across different instruction templates. Using a tabular structure leads to more consistent performance.
  • Figure 5: Number of input tokens (left) and F1 scores (right) of GPT-4o for Retrieval requests under varying sparsity levels for different structures (Text, JSON, KG, KG_shuffled, and Table). Zero sparsity indicates a dense data scenario, and lower token counts imply greater token efficiency. We observe that injecting semi-structured formats degrades model performance, while tables remain the most effective and token-efficient.
  • ...and 4 more figures