Tabular Representation, Noisy Operators, and Impacts on Table Structure Understanding Tasks in LLMs
Ananya Singha, José Cambronero, Sumit Gulwani, Vu Le, Chris Parnin
TL;DR
The paper systematically analyzes how tabular representation formats and real-world-inspired noise operations affect LLMs' ability to perform self-supervised table-structure tasks via in-context learning. It introduces an evaluation framework spanning eight formats and eight noise types, assessing both fact-finding and transformation tasks on seven Kaggle datasets using GPT-3. Key findings show that the DFLoader and JSON formats typically yield the best performance for most tasks, while noise can either improve or degrade results depending on the task-format combination. The work highlights the brittleness of LLMs to input format and motivates further multi-LLM studies, as well as investigations into how table-structure robustness translates to downstream table tasks. Overall, the paper provides practical guidance for prompt design and data preparation when working with tabular data in LLM applications.
Abstract
Large language models (LLMs) are increasingly applied to tabular tasks using in-context learning. The prompt representation for a table may play a role in the LLM's ability to process the table. Inspired by prior work, we generate a collection of self-supervised structural tasks (e.g., navigate to a cell and row; transpose the table) and evaluate the performance differences when using 8 formats. In contrast to past work, we introduce 8 noise operations inspired by real-world messy data and adversarial inputs, and show that such operations can impact LLM performance across formats for different structural understanding tasks.
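
To make the setup described in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a small made-up table and uses hypothetical helper names (`to_json`, `to_markdown`, `shuffle_rows`, `transpose`, `cell_lookup_example`). It shows how a single table might be serialized into two prompt formats, how one row-shuffling noise operation could be applied, and how ground-truth answers for a cell-navigation task and a transposition task can be derived from the table itself, which is what makes such tasks self-supervised. The specific formats and the noise operation shown here are illustrative assumptions.

```python
import json
import random

# Hypothetical toy table (not taken from the paper's datasets).
HEADER = ["name", "age", "city"]
ROWS = [["Ada", 36, "London"], ["Linus", 28, "Helsinki"], ["Grace", 45, "New York"]]


def to_json(header, rows):
    """Serialize the table as a JSON list of row objects."""
    return json.dumps([dict(zip(header, row)) for row in rows])


def to_markdown(header, rows):
    """Serialize the table as a Markdown pipe table."""
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(str(v) for v in row) + " |" for row in rows]
    return "\n".join(lines)


def shuffle_rows(rows, seed=0):
    """Illustrative noise operation: reorder rows without changing their content."""
    rng = random.Random(seed)
    shuffled = list(rows)  # copy so the input is left untouched
    rng.shuffle(shuffled)
    return shuffled


def transpose(header, rows):
    """Ground truth for a transposition task: each column becomes a row led by its header."""
    return [[h] + [row[i] for row in rows] for i, h in enumerate(header)]


def cell_lookup_example(header, rows, row_idx, col_name):
    """Build a (question, answer) pair for a cell-navigation task from the table itself."""
    question = f"What is the value of column '{col_name}' in row {row_idx}?"
    answer = rows[row_idx][header.index(col_name)]
    return question, answer
```

In a prompt, the output of `to_json` or `to_markdown` (optionally after `shuffle_rows`) would be followed by the question from `cell_lookup_example`, and the model's reply would be compared against the derived answer.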
