Table-GPT: Table-tuned GPT for Diverse Table Tasks

Peng Li, Yeye He, Dror Yashar, Weiwei Cui, Song Ge, Haidong Zhang, Danielle Rifinski Fainman, Dongmei Zhang, Surajit Chaudhuri

TL;DR

Table-GPT introduces a table-tuning paradigm to overcome the limitations of pre-trained language models on two-dimensional table tasks. By synthesizing diverse table-tasks from real tables and applying multi-level data augmentations, the approach yields a Table-GPT that outperforms vanilla GPT-3.5 and ChatGPT on both seen and unseen table tasks, while preserving zero-shot and few-shot instruction-following. The experiments show widespread improvements (98/104 tests) and demonstrate the utility of Table-GPT as a table foundation model that can benefit downstream task-specific prompting and fine-tuning. The work positions table-tuning as a complementary and scalable alternative to prompt engineering for improving table understanding in large language models, with potential applications across data extraction, cleaning, and transformation tasks.

Abstract

Language models, such as GPT-3.5 and ChatGPT, demonstrate remarkable abilities to follow diverse human instructions and perform a wide range of tasks. However, when probing language models using a range of basic table-understanding tasks, we observe that today's language models are still sub-optimal in many table-related tasks, likely because they are pre-trained predominantly on one-dimensional natural-language texts, whereas relational tables are two-dimensional objects. In this work, we propose a new "table-tuning" paradigm, where we continue to train/fine-tune language models like GPT-3.5 and ChatGPT, using diverse table-tasks synthesized from real tables as training data, with the goal of enhancing language models' ability to understand tables and perform table tasks. We show that our resulting Table-GPT models demonstrate (1) better table-understanding capabilities, by consistently outperforming the vanilla GPT-3.5 and ChatGPT on a wide range of table tasks, including holdout unseen tasks, and (2) strong generalizability, in their ability to respond to diverse human instructions to perform new table-tasks, in a manner similar to GPT-3.5 and ChatGPT.

Paper Structure

This paper contains 18 sections, 15 equations, 15 figures, 6 tables, and 1 algorithm.

Figures (15)

  • Figure 1: Two simple tests to probe language-models' basic ability to read and understand tables. (Left) T-1: Missing cells identification, which is to identify the column-header/row-id of a missing cell. (Right) T-2: Column-Finding, which is to identify the column-name of a given value. Even large models (e.g. 175B GPT-3.5) can frequently fail on such tests, with only 0.26 accuracy in one variant of the tests.
  • Figure 2: Example table-tasks, where the ability of language models to "read" tables vertically is important. (Left) T-3: Table Question-Answering. (Right) T-8: Data Imputation. More tasks like these are shown in the paper's task-summary table.
  • Figure 3: Instruction-tuning vs. Table-tuning. (Left) Instruction-tuning is a technique developed in the NLP community that continues to train language-models (e.g., GPT) for instruction-following capabilities (e.g., in ChatGPT). (Right) Table-tuning is an analogous approach we propose to train language-models to better understand tables and perform table-tasks.
  • Figure 4: Table-models should ideally "generalize" to new datasets and new tasks. (Left) Column type annotation (CTA): while this is a common table-task, the list of target-types to choose from can vary from dataset to dataset (e.g., 78 types in cta-sherlock, and 107 in turl). Enabling table-models to "generalize" to a new CTA dataset without retraining is useful. (Right) Text-to-Table: a general table-model should be as general-purpose as models like ChatGPT in following instructions to perform novel unseen table-tasks, such as "extracting tables from text" in the example.
  • Figure 5: Instruction-tuning vs. Table-tuning. Instruction-tuning improves model "generalizability", to follow diverse human-instructions to perform new and unseen tasks (x-axis), whereas our proposed table-tuning is analogous in spirit but aims to improve model ability to understand tables and perform table-tasks (y-axis).
  • ...and 10 more figures
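
The two probe tests described in Figure 1 can be illustrated with a small sketch that synthesizes a test instance from an existing table. This is not the authors' code: the function names, the markdown-style serialization, the prompt wording, and the choice to blank a cell (rather than remove it along with its separator) are all assumptions made for illustration.

```python
import random

# Hypothetical example table; any header/rows pair with matching widths works.
HEADER = ["name", "city", "year"]
ROWS = [["Ann", "Oslo", 1990], ["Bob", "Rome", 1985], ["Cy", "Lima", 2001]]


def to_markdown(header, rows):
    """Serialize a table as a markdown-style string (assumed format)."""
    lines = ["| " + " | ".join(header) + " |"]
    lines.append("|" + "|".join(["---"] * len(header)) + "|")
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


def make_column_finding(header, rows, rng):
    """T-2 style test: given a value, the answer is the name of its column.

    Only values that occur exactly once in the table are used, so the
    expected answer is unambiguous.
    """
    counts = {}
    for row in rows:
        for cell in row:
            counts[cell] = counts.get(cell, 0) + 1
    candidates = [
        (r, c)
        for r, row in enumerate(rows)
        for c, cell in enumerate(row)
        if counts[cell] == 1
    ]
    r, c = rng.choice(candidates)
    prompt = (
        f"{to_markdown(header, rows)}\n\n"
        f"Which column contains the value '{rows[r][c]}'?"
    )
    return prompt, header[c]


def make_missing_cell(header, rows, rng):
    """T-1 style test: blank one cell; the answer is that cell's column header."""
    r = rng.randrange(len(rows))
    c = rng.randrange(len(header))
    damaged = [list(row) for row in rows]  # copy so the input table is untouched
    damaged[r][c] = ""
    prompt = (
        f"{to_markdown(header, damaged)}\n\n"
        "One cell in this table is missing. Which column is it in?"
    )
    return prompt, header[c]


if __name__ == "__main__":
    rng = random.Random(0)
    for make in (make_column_finding, make_missing_cell):
        prompt, answer = make(HEADER, ROWS, rng)
        print(prompt)
        print("expected answer:", answer)
        print()
```

Because the expected answer is derived from the table itself, instances like these can be generated from any table without manual labeling, which is the general idea behind synthesizing table-tasks from real tables.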