OpenTab: Advancing Large Language Models as Open-domain Table Reasoners
Kezhi Kong, Jiani Zhang, Zhengyuan Shen, Balasubramaniam Srinivasan, Chuan Lei, Christos Faloutsos, Huzefa Rangwala, George Karypis
TL;DR
OpenTab tackles open-domain table reasoning by grounding LLM outputs in retrieved tables, combining a BM25-based retriever with a reasoning pipeline that requires no fine-tuning. The pipeline decomposes reasoning into a Coder that generates SQL, a RowSelector that curates evidence rows, and a Reader that produces the final answer, augmented by a Generative Reranking & Sequential Reasoning (GRSR) strategy to mitigate hallucinations. Experiments on Open-WikiTables, WikiTableQuestions, and FEVEROUS show that OpenTab outperforms baselines in both open-domain and closed-domain settings, with accuracy gains of up to 21.5%. The work demonstrates robust, scalable grounding for tabular data, and its ablations confirm the value of simple-to-complex SQL generation and the GRSR strategy.
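To make the retrieval step concrete, the following is a minimal sketch of BM25 ranking over tables that have been flattened into strings. It is not the OpenTab implementation: the whitespace tokenizer, the `k1` and `b` values, the function name `bm25_rank`, and the three toy table strings are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real systems use richer tokenizers.
    return text.lower().split()


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return document indices sorted by descending Okapi BM25 score."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return sorted(range(n), key=lambda i: scores[i], reverse=True)


# Hypothetical tables, each serialized as title plus column headers.
tables = [
    "olympic medal table country gold silver bronze",
    "list of rivers by length name country km",
    "premier league season team points goals",
]
ranking = bm25_rank("which country won the most gold medals", tables)
```

Here `ranking` lists table indices from most to least relevant. In a full pipeline, the top-ranked tables would be handed to the reasoning stage.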
Abstract
Large Language Models (LLMs) trained on large volumes of data excel at various natural language tasks, but they cannot handle tasks that require knowledge absent from their training data. One solution is to use a retriever that fetches relevant information to expand the LLM's knowledge scope. However, existing text-oriented retrieval-based LLMs are not ideal for structured table data because of diverse data modalities and large table sizes. In this work, we propose OpenTab, an open-domain table reasoning framework powered by LLMs. OpenTab leverages a table retriever to fetch relevant tables and then generates SQL programs to parse the retrieved tables efficiently. Using the intermediate data derived from the SQL executions, it conducts grounded inference to produce accurate responses. Extensive experimental evaluation shows that OpenTab significantly outperforms baselines in both open- and closed-domain settings, achieving up to 21.5% higher accuracy. We further run ablation studies to validate the efficacy of the proposed system designs.
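The idea of parsing a retrieved table with SQL and reasoning over the execution result can be sketched as follows. This is an illustrative sketch only, not the paper's code: the helper `run_sql`, the table name `t`, the toy rows, and the hard-coded SQL string (standing in for what an LLM might generate) are all assumptions.

```python
import sqlite3


def run_sql(rows, columns, sql):
    """Load a table into in-memory SQLite and return the rows selected by `sql`."""
    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f'"{c}"' for c in columns)
    conn.execute(f"CREATE TABLE t ({cols})")
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(f"INSERT INTO t VALUES ({placeholders})", rows)
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


# Made-up table contents, for illustration only.
columns = ["country", "gold", "silver"]
rows = [("Norway", 16, 8), ("Germany", 12, 10), ("Canada", 4, 8)]

# Stand-in for a generated SQL program answering
# "Which country won the most gold medals?"
sql = "SELECT country, gold FROM t ORDER BY gold DESC LIMIT 1"

# The execution result is the compact evidence a downstream reader would see
# instead of the full table.
evidence = run_sql(rows, columns, sql)
```

Executing the query shrinks a potentially large table to a few evidence rows, which is what keeps the final inference grounded in the data and within the model's context budget.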
