ProTrix: Building Models for Planning and Reasoning over Tables with Sentence Context
Zirui Wu, Yansong Feng
TL;DR
ProTrix introduces Plan-then-Reason, a framework that first plans reasoning pathways over tabular data with sentence context and then assigns each step to either textual or program-based reasoning to produce answers. The method leverages an instruction-tuning dataset, TrixInstruct, to train models (ProTrix and ProTrix-Coder) that generalize to unseen tabular tasks with only around 6k training examples and can generate faithful explanations. In-context learning and finetuning experiments show strong generalization across diverse benchmarks (WikiTQ, FEVEROUS, WikiSQL, TabFact, HybridQA, SciTAB, etc.) with improved efficiency (fewer prompts and API calls) and robust performance without self-consistency. The work highlights planning and modular reasoning as key capabilities for generalist tabular QA systems and provides open-source datasets and models to foster future research, while noting limitations such as handling only single relational tables and the need for enhanced multi-table and hierarchical table support.
Abstract
Tables play a crucial role in conveying information in various domains. We propose a Plan-then-Reason framework to answer different types of user queries over tables with sentence context. The framework first plans the reasoning paths over the context, then assigns each step to program-based or textual reasoning to reach the final answer. This framework enhances the table reasoning abilities of both in-context learning and fine-tuning methods. GPT-3.5-Turbo following the Plan-then-Reason framework surpasses other prompting baselines without self-consistency while using fewer API calls and in-context demonstrations. We also construct an instruction-tuning set, TrixInstruct, to evaluate the effectiveness of fine-tuning with this framework. We present the ProTrix model family, obtained by fine-tuning models on TrixInstruct. Our experiments show that the ProTrix family generalizes to diverse unseen tabular tasks with only 6k training instances. We further demonstrate that ProTrix can generate accurate and faithful explanations to answer complex free-form questions. Our work underscores the importance of planning and reasoning abilities for building models that handle tabular tasks with generalizability and interpretability. We open-source our dataset and models at https://github.com/WilliamZR/ProTrix.
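The plan-then-dispatch idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `Step` class, `execute_plan`, the two step functions, and the example table and sentences are all hypothetical, and the plan is written by hand, whereas in the paper a language model produces it. The sketch only shows how a query could be split into one program-based step over the table and one textual step over the sentence context.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

Table = List[Dict[str, Any]]


@dataclass
class Step:
    kind: str                             # "program" or "text"
    description: str                      # human-readable plan entry
    run: Callable[[Dict[str, Any]], Any]  # hypothetical executor for this step


def execute_plan(
    plan: List[Step], table: Table, sentences: List[str]
) -> Tuple[Any, List[str]]:
    """Run the planned steps in order; return the last result and a trace."""
    state: Dict[str, Any] = {"table": table, "sentences": sentences, "prev": None}
    trace: List[str] = []
    for step in plan:
        if step.kind not in ("program", "text"):
            raise ValueError(f"unknown step kind: {step.kind}")
        state["prev"] = step.run(state)  # each step sees the previous result
        trace.append(f"[{step.kind}] {step.description} -> {state['prev']!r}")
    return state["prev"], trace


def largest_city(state: Dict[str, Any]) -> str:
    # Program-based step: an exact computation over the table rows.
    return max(state["table"], key=lambda row: row["population"])["city"]


def is_in_france(state: Dict[str, Any]) -> bool:
    # Textual step: a plain substring check standing in for model reading.
    city = state["prev"]
    return any(city in s and "France" in s for s in state["sentences"])


# Made-up example data (assumption, not from the paper).
table = [
    {"city": "Lyon", "population": 516000},
    {"city": "Porto", "population": 232000},
]
sentences = ["Lyon is a city in France.", "Porto is a city in Portugal."]

# Hand-written plan for: "Is the most populous city in the table in France?"
plan = [
    Step("program", "find the most populous city", largest_city),
    Step("text", "check whether that city is in France", is_in_france),
]

answer, trace = execute_plan(plan, table, sentences)
print(answer)
for line in trace:
    print(line)
```

The trace doubles as a simple explanation of how the answer was reached, which loosely mirrors the abstract's emphasis on interpretable reasoning.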
