Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding

Zilong Wang, Hao Zhang, Chun-Liang Li, Julian Martin Eisenschlos, Vincent Perot, Zifeng Wang, Lesly Miculicich, Yasuhisa Fujii, Jingbo Shang, Chen-Yu Lee, Tomas Pfister

TL;DR

This work introduces Chain-of-Table, a reasoning framework that carries out multi-step tabular reasoning by evolving a table through a sequence of atomic operations. At each step, a large language model dynamically plans the next operation, generates its arguments, and applies it to transform the table, with intermediate tables acting as structured proxies for its thoughts. The chain ends with a final query that leverages the evolved table to produce the answer, enabling more accurate and reliable table understanding across tasks like table-based QA and fact verification. Empirical results on WikiTQ, TabFact, and FeTaQA show state-of-the-art performance across multiple backbones, with analyses highlighting efficiency gains, robustness to table size, and interpretable intermediate reasoning steps.

Abstract

Table-based reasoning with large language models (LLMs) is a promising direction to tackle many table understanding tasks, such as table-based question answering and fact verification. Compared with generic reasoning, table-based reasoning requires the extraction of underlying semantics from both free-form questions and semi-structured tabular data. Chain-of-Thought and its similar approaches incorporate the reasoning chain in the form of textual context, but it is still an open question how to effectively leverage tabular data in the reasoning chain. We propose the Chain-of-Table framework, where tabular data is explicitly used in the reasoning chain as a proxy for intermediate thoughts. Specifically, we guide LLMs using in-context learning to iteratively generate operations and update the table to represent a tabular reasoning chain. LLMs can therefore dynamically plan the next operation based on the results of the previous ones. This continuous evolution of the table forms a chain, showing the reasoning process for a given tabular problem. The chain carries structured information of the intermediate results, enabling more accurate and reliable predictions. Chain-of-Table achieves new state-of-the-art performance on WikiTQ, FeTaQA, and TabFact benchmarks across multiple LLM choices.
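
The iterative procedure described in the abstract (plan an operation, generate its arguments, update the table, repeat, then query the final table) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the operation names (`select_columns`, `sort_by`, `head`), the `END` marker, the `max_steps` cap, and the toy table are all hypothetical, and the three LLM calls are replaced by scripted stand-in functions so that the sketch runs without a model.

```python
from typing import Callable, Dict, List, Tuple

Table = List[Dict[str, object]]
Chain = List[Tuple[str, tuple]]

# Hypothetical operation pool: each atomic operation maps a table to a new table.
def select_columns(table: Table, columns: List[str]) -> Table:
    return [{c: row[c] for c in columns} for row in table]

def sort_by(table: Table, column: str, descending: bool) -> Table:
    return sorted(table, key=lambda row: row[column], reverse=descending)

def head(table: Table, n: int) -> Table:
    return table[:n]

OPERATIONS: Dict[str, Callable[..., Table]] = {
    "select_columns": select_columns,
    "sort_by": sort_by,
    "head": head,
}
END = "END"  # assumed marker meaning "stop transforming the table"

def chain_of_table(table, question, plan, generate_args, query, max_steps=5):
    """Evolve the table step by step, then answer from the final table."""
    chain: Chain = []
    for _ in range(max_steps):
        op = plan(table, question, chain)          # choose the next operation
        if op == END:
            break
        args = generate_args(table, question, op)  # fill in its arguments
        table = OPERATIONS[op](table, *args)       # intermediate table
        chain.append((op, args))
    return query(table, question), chain

# Scripted stand-ins for the LLM calls (assumption: a real system would prompt
# a model with in-context examples at each of these three points).
SCRIPT = [
    ("select_columns", (["team", "gold"],)),
    ("sort_by", ("gold", True)),
    ("head", (1,)),
]

def scripted_plan(table, question, chain):
    return SCRIPT[len(chain)][0] if len(chain) < len(SCRIPT) else END

def scripted_args(table, question, op):
    return dict(SCRIPT)[op]

def scripted_query(table, question):
    return str(table[0]["team"])

# Toy data, invented purely for illustration.
toy_table = [
    {"team": "Team A", "gold": 3, "silver": 4},
    {"team": "Team B", "gold": 7, "silver": 1},
    {"team": "Team C", "gold": 5, "silver": 2},
]
answer, chain = chain_of_table(
    toy_table, "Which team won the most gold medals?",
    scripted_plan, scripted_args, scripted_query,
)
```

Here `chain` records the operations applied so far, which is what allows the planner to condition each choice on the results of the previous steps, and the table passed to the final query is the evolved one rather than the original input.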

Paper Structure (32 sections, 18 figures, 10 tables, 1 algorithm)

Figures (18)

  • Figure 1: Illustration of the comparison between (a) generic reasoning, (b) program-aided reasoning, and (c) the proposed Chain-of-Table. Given a complex table where a cyclist's nationality and name are in the same cell, (a) is unable to provide the correct answer through multi-step reasoning due to the complexity; (b) generates and executes programs (e.g. SQL queries) to deliver the answer, but it also falls short in accurately parsing the name and nationality in the table. In contrast, (c) Chain-of-Table iteratively samples a chain of operations that effectively transform the complex table into a version specifically tailored to the question. With the assistance of Chain-of-Table, the LLM can arrive at the correct answer.
  • Figure 2: Illustration of DynamicPlan($T$, $Q$, chain) and GenerateArgs($T$, $Q$, f) in the proposed Chain-of-Table, where $T$ is an intermediate table, $Q$ is the question, chain is the list of operations already performed on the table, and f is the operation selected by DynamicPlan. Left: DynamicPlan samples the next operation from the operation pool according to ($T$, chain, $Q$). Right: GenerateArgs takes the selected operation f as input and generates its arguments based on ($T$, f, $Q$). The operations, along with their arguments, act as a proxy for the tabular reasoning process to effectively tackle table understanding tasks.
  • Figure 3: Performance of Chain-of-Thought, Dater, and the proposed Chain-of-Table on WikiTQ for questions that require operation chains of varying lengths. The proposed atomic operations allow Chain-of-Table to dynamically transform the input table through multiple reasoning iterations, which significantly improves performance over the generic and program-aided reasoning counterparts.
  • Figure 4: Illustration of the tabular reasoning process in Chain-of-Table. This iterative process involves dynamically planning an operation chain and accurately storing intermediate results in the transformed tables. These intermediate tables serve as a tabular thought process that guides the LLM to the correct answer more reliably.
  • Figure 5: Example results of Chain-of-Table on FeTaQA evaluated with ROUGE, where the ROUGE metrics assign very low scores even though the generated answers are correct.
  • ...and 13 more figures