Piece of Table: A Divide-and-Conquer Approach for Selecting Subtables in Table Question Answering
Wonjin Lee, Kyumin Kim, Sungjae Lee, Jihun Lee, Kwang In Kim
TL;DR
PieTa addresses the difficulty of applying language models to 2D tables under token-length constraints by iteratively partitioning tables into windows, selecting relevant cells per window, and merging the selections into subtables. This multi-resolution divide-and-conquer approach preserves cross-row/column dependencies while avoiding long-context inputs. The authors introduce a coordinate-based subtable representation, fine-tune a dedicated selector, and demonstrate substantial improvements on WikiTQ and WikiSQL across multiple readers, with subtables averaging about 13.9% of the original table size. The work provides a flexible, robust subtable-based QA paradigm that can be integrated with various readers and extended to SQL-based retrieval, offering practical gains for large, real-world tables.
Abstract
Applying language models (LMs) to tables is challenging due to the inherent structural differences between two-dimensional tables and the one-dimensional text for which LMs were originally designed. Furthermore, when linearized tables are fed to LMs, the maximum token lengths imposed by self-attention computation make it difficult to capture context spread across large tables. To address these challenges, we present PieTa (Piece of Table), a new framework for subtable-based question answering (QA). PieTa operates through an iterative process of dividing tables into smaller windows, using LMs to select relevant cells within each window, and merging these cells into a subtable. This multi-resolution approach captures dependencies across multiple rows and columns while avoiding the limitations caused by long context inputs. Instantiated as a simple iterative subtable union algorithm, PieTa demonstrates improved performance over previous subtable-based QA approaches.
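The divide, select, and merge loop described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`windows`, `select_subtable`, `keyword_selector`), the non-overlapping window layout, the stopping rule (stop when the kept rows and columns no longer change), and the toy keyword-matching selector that stands in for the fine-tuned LM selector are all assumptions made for illustration.

```python
from typing import Callable, Iterator, List, Set, Tuple

Cell = Tuple[int, int]  # (row, column) coordinate in the ORIGINAL table
Table = List[List[str]]
# Hypothetical selector interface: given a question and the cells of one window,
# return the coordinates judged relevant. In PieTa this role is played by an LM.
Selector = Callable[[str, Table, List[Cell]], Set[Cell]]


def windows(rows: List[int], cols: List[int], size: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Split the current row/column index lists into size x size windows (assumed non-overlapping)."""
    for i in range(0, len(rows), size):
        for j in range(0, len(cols), size):
            yield rows[i:i + size], cols[j:j + size]


def select_subtable(table: Table, question: str, selector: Selector,
                    size: int = 2, max_iters: int = 10) -> Tuple[List[int], List[int]]:
    """Iteratively shrink the table to the rows and columns that contain selected cells."""
    rows = list(range(len(table)))
    cols = list(range(len(table[0]))) if table else []
    for _ in range(max_iters):
        kept: Set[Cell] = set()
        for win_rows, win_cols in windows(rows, cols, size):
            coords = [(r, c) for r in win_rows for c in win_cols]
            # Only accept coordinates that actually lie inside this window.
            kept |= set(selector(question, table, coords)) & set(coords)
        # Union step: the new subtable spans every row and column with a selected cell.
        new_rows = sorted({r for r, _ in kept})
        new_cols = sorted({c for _, c in kept})
        if (new_rows, new_cols) == (rows, cols):
            break  # assumed stopping rule: the subtable no longer changes
        rows, cols = new_rows, new_cols
    return rows, cols


def keyword_selector(question: str, table: Table, coords: List[Cell]) -> Set[Cell]:
    """Toy stand-in for the LM selector: keep cells whose text occurs in the question."""
    q = question.lower()
    return {(r, c) for r, c in coords if table[r][c].lower() in q}


table = [
    ["Year", "City", "Country", "Medals"],
    ["2008", "Beijing", "China", "100"],
    ["2012", "London", "UK", "65"],
    ["2016", "Rio", "Brazil", "19"],
]
rows, cols = select_subtable(table, "How many medals in London?", keyword_selector)
subtable = [[table[r][c] for c in cols] for r in rows]
print(subtable)
```

In this toy run the first pass selects "Medals" and "London" in two different windows. Merging their rows and columns brings the answer cell "65" into the subtable even though no single window contained all of the relevant cells, which is the cross-window dependency that the union step is meant to preserve.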
