ALTER: Augmentation for Large-Table-Based Reasoning
Han Zhang, Yuheng Ma, Hanfang Yang
TL;DR
The paper addresses the challenge of scaling LLM-based table reasoning to large tables by introducing ALTER, a framework that decouples data access from reasoning through an augment-filter-execution pipeline. It combines two augmentation components, the Query Augmentor and the Table Augmentor, with an SQL-based Table Organizer and a Joint Reasoner, so that the model operates on a small, augmented view of the table ($K$ observed rows) while still drawing on rich schema, semantic, and literal information. Extensive experiments on WikiTQ and TabFact show that ALTER achieves state-of-the-art or near-state-of-the-art performance, with particular strength in large-table scenarios and robustness to noise and to increases in table size. For real-world table reasoning, the framework reduces data leakage, noise, and computation while preserving accuracy through structured augmentation and selective execution.
Abstract
While extensive research has explored the use of large language models (LLMs) for table-based reasoning, most approaches struggle with scalability when applied to large tables. To maintain the superior comprehension abilities of LLMs in these scenarios, we introduce ALTER (Augmentation for Large-Table-Based Reasoning), a framework designed to harness the latent augmentation potential in both free-form natural language (NL) questions, via the query augmentor, and semi-structured tabular data, through the table augmentor. By utilizing only a small subset of relevant data from the table and supplementing it with pre-augmented schema, semantic, and literal information, ALTER achieves outstanding performance on table-based reasoning benchmarks. We also provide a detailed analysis of large-table scenarios, comparing different methods and various partitioning principles. In these scenarios, our method outperforms all other approaches and remains robust and efficient under perturbations.
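As a rough illustration of reasoning from a small filtered view plus pre-computed schema and literal information, the following Python sketch may help. It is not the authors' implementation: the helper names `augment_table`, `organize_table`, and `build_prompt`, the toy `medals` table and its rows, the hand-written SQL query (standing in for a query that an LLM would generate), and the row budget `k` are all assumptions made for illustration. No LLM is called; the sketch stops at the prompt that a reasoner would receive.

```python
import sqlite3


def augment_table(conn, table, k=2):
    """Hypothetical table augmentation: column types plus up to k sample literals."""
    info = {}
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    for _cid, name, col_type, *_rest in conn.execute(f'PRAGMA table_info("{table}")'):
        samples = [
            row[0]
            for row in conn.execute(
                f'SELECT DISTINCT "{name}" FROM "{table}" ORDER BY 1 LIMIT ?', (k,)
            )
        ]
        info[name] = {"type": col_type, "samples": samples}
    return info


def organize_table(conn, sql, k=2):
    """Hypothetical table organizer: run a filtering query and keep at most k rows."""
    cur = conn.execute(sql)
    columns = [d[0] for d in cur.description]
    return columns, cur.fetchmany(k)


def build_prompt(question, info, columns, rows):
    """Combine the question, augmented schema, and the small row subset into one prompt."""
    schema = [f"- {n} ({m['type']}), e.g. {m['samples']}" for n, m in info.items()]
    body = [" | ".join(str(v) for v in row) for row in rows]
    return "\n".join(
        [f"Question: {question}", "Schema:", *schema,
         "Rows (" + " | ".join(columns) + "):", *body]
    )


# Toy data, invented for this example only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE medals (nation TEXT, gold INTEGER, year INTEGER)")
conn.executemany(
    "INSERT INTO medals VALUES (?, ?, ?)",
    [("Norway", 16, 2022), ("Germany", 12, 2022), ("China", 9, 2022),
     ("Norway", 14, 2018), ("Germany", 14, 2018)],
)

info = augment_table(conn, "medals", k=2)
# In a real pipeline an LLM would write this query; here it is hard-coded.
columns, rows = organize_table(
    conn, "SELECT nation, gold FROM medals WHERE year = 2022 ORDER BY gold DESC", k=2
)
prompt = build_prompt("Which nation won the most gold in 2022?", info, columns, rows)
print(prompt)
```

The point of the sketch is that the prompt size depends on `k` and the number of columns rather than on the number of rows in the table, which is the property the abstract attributes to working from a small subset of relevant data.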
