Table-GPT: Table-tuned GPT for Diverse Table Tasks
Peng Li, Yeye He, Dror Yashar, Weiwei Cui, Song Ge, Haidong Zhang, Danielle Rifinski Fainman, Dongmei Zhang, Surajit Chaudhuri
TL;DR
Table-GPT introduces a table-tuning paradigm to address the limitations of pre-trained language models on two-dimensional table tasks. By synthesizing diverse table-tasks from real tables and applying multi-level data augmentations, the approach yields Table-GPT models that outperform vanilla GPT-3.5 and ChatGPT on both seen and unseen table tasks, while preserving zero-shot and few-shot instruction-following. The experiments show improvements in 98 of 104 tests and demonstrate the utility of Table-GPT as a table foundation model that can benefit downstream task-specific prompting and fine-tuning. The work positions table-tuning as a complementary and scalable alternative to prompt engineering for improving table understanding in large language models, with potential applications across data extraction, cleaning, and transformation tasks.
Abstract
Language models, such as GPT-3.5 and ChatGPT, demonstrate remarkable abilities to follow diverse human instructions and perform a wide range of tasks. However, when probing language models using a range of basic table-understanding tasks, we observe that today's language models are still sub-optimal on many table-related tasks, likely because they are pre-trained predominantly on one-dimensional natural-language texts, whereas relational tables are two-dimensional objects. In this work, we propose a new "table-tuning" paradigm, where we continue to train/fine-tune language models like GPT-3.5 and ChatGPT, using diverse table-tasks synthesized from real tables as training data, with the goal of enhancing language models' ability to understand tables and perform table tasks. We show that our resulting Table-GPT models demonstrate (1) better table-understanding capabilities, by consistently outperforming the vanilla GPT-3.5 and ChatGPT on a wide range of table tasks, including holdout unseen tasks, and (2) strong generalizability, in their ability to respond to diverse human instructions to perform new table-tasks, in a manner similar to GPT-3.5 and ChatGPT.
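To make the idea of "table-tasks synthesized from real tables" concrete, the following is a minimal Python sketch of how one training instance might be built. It is an illustration only, not the authors' pipeline: the cell-filling task, the `[MISSING]` marker, the pipe-delimited serialization, the column-permutation augmentation, and all function names are assumptions introduced here.

```python
import random


def serialize_table(header, rows):
    """Render a table as pipe-delimited text (one possible serialization)."""
    lines = ["|" + "|".join(header) + "|"]
    lines.append("|" + "|".join("---" for _ in header) + "|")
    for row in rows:
        lines.append("|" + "|".join(str(cell) for cell in row) + "|")
    return "\n".join(lines)


def permute_columns(header, rows, rng):
    """Hypothetical augmentation: reorder columns, which should not
    change what a relational table means."""
    order = list(range(len(header)))
    rng.shuffle(order)
    new_header = [header[i] for i in order]
    new_rows = [[row[i] for i in order] for row in rows]
    return new_header, new_rows


def synthesize_fill_cell_task(header, rows, rng):
    """Hypothetical task: mask one cell and ask the model to restore it.

    Returns an (instruction, table, completion) record of the kind that
    could serve as a fine-tuning example.
    """
    r = rng.randrange(len(rows))
    c = rng.randrange(len(header))
    answer = str(rows[r][c])

    # Copy the rows so the source table is left untouched.
    masked = [list(row) for row in rows]
    masked[r][c] = "[MISSING]"

    # Apply the augmentation after masking; the answer is unaffected.
    aug_header, aug_rows = permute_columns(header, masked, rng)

    instruction = (
        "Fill in the cell marked [MISSING] in the table below. "
        "Return only the value."
    )
    return {
        "instruction": instruction,
        "table": serialize_table(aug_header, aug_rows),
        "completion": answer,
    }


# Toy table standing in for a "real" table.
header = ["country", "capital", "continent"]
rows = [
    ["France", "Paris", "Europe"],
    ["Japan", "Tokyo", "Asia"],
    ["Kenya", "Nairobi", "Africa"],
]

example = synthesize_fill_cell_task(header, rows, random.Random(0))
print(example["instruction"])
print(example["table"])
print("->", example["completion"])
```

Because the ground-truth answer is taken from the table itself, instances of this kind can be generated at scale without manual labeling, which is the property that makes synthesis from existing tables attractive as a source of training data.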
