Effective Distillation of Table-based Reasoning Ability from LLMs

Bohao Yang; Chen Tang; Kun Zhao; Chenghao Xiao; Chenghua Lin

TL;DR

The paper addresses the high compute cost of LLMs by distilling their table-based reasoning into smaller, task-tailored models for scientific table-to-text generation. It proposes a two-stage pipeline: (i) generate table-based CoT data from a large teacher LLM using one-shot CoT with Self-Refine filtering, and (ii) fine-tune small models on the distilled data to transfer reasoning ability, optimizing $P(Y\mid T,R)$. Experiments on SciGen show that a 220M-parameter Flan-T5-base fine-tuned with distilled CoT data can outperform certain LLM baselines on specific metrics and achieve strong faithfulness (TAPAS-Acc and TAPEX-Acc) scores, approaching teacher performance. The results demonstrate practical benefits for deploying table reasoning in resource-constrained settings, reducing model size and data requirements while maintaining high-quality, factually grounded table-to-text descriptions of scientific data.
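The two-stage pipeline can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `teacher` and `verifier` callables are hypothetical stand-ins for the teacher LLM prompt and the filtering step, the names `DistilledExample`, `distill`, and `to_seq2seq` are invented for this sketch, and the source/target layout in `to_seq2seq` is only one possible reading of the objective $P(Y\mid T,R)$.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class DistilledExample:
    table: str        # linearised input table T
    reasoning: str    # teacher-generated table-based reasoning R
    description: str  # teacher-generated description Y


def distill(
    tables: Iterable[str],
    teacher: Callable[[str], Tuple[str, str]],
    verifier: Callable[[str, str, str], bool],
) -> List[DistilledExample]:
    """Stage (i): query the teacher, then keep only verified pairs."""
    kept = []
    for table in tables:
        # Hypothetical teacher call: returns (reasoning, description).
        reasoning, description = teacher(table)
        # Hypothetical filter: drop pairs the verifier judges to be false.
        if verifier(table, reasoning, description):
            kept.append(DistilledExample(table, reasoning, description))
    return kept


def to_seq2seq(example: DistilledExample) -> Tuple[str, str]:
    """Stage (ii) data prep: build a (source, target) pair for fine-tuning.

    Assumption: the source carries T and R and the target is Y. The real
    input format used by the paper may differ.
    """
    source = f"table: {example.table} reasoning: {example.reasoning}"
    return source, example.description
```

The resulting (source, target) pairs could then be passed to any standard sequence-to-sequence fine-tuning loop for a small model such as Flan-T5-base.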

Abstract

Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of natural language processing tasks. However, their enormous parameter counts and extremely high compute requirements pose challenges for practical deployment. Recent research has revealed that specific capabilities of LLMs, such as numerical reasoning, can be transferred to smaller models through distillation. Some studies explore the potential of leveraging LLMs to perform table-based reasoning. However, no prior work has focused on table reasoning skills in smaller models specifically tailored for scientific table-to-text generation tasks. In this paper, we propose a novel table-based reasoning distillation approach, with the aim of distilling LLMs into tailored smaller models. Our experimental results show that a 220-million-parameter model (Flan-T5-base) fine-tuned on distilled data not only achieves a significant improvement over traditionally fine-tuned baselines but also surpasses specific LLMs on a scientific table-to-text generation dataset. Our code is available at https://github.com/Bernard-Yang/DistillTableCoT.

Paper Structure (21 sections, 3 equations, 7 figures, 3 tables)


Introduction
Related Work
Table-based Reasoning
Chain-of-thought Reasoning
Knowledge Distillation
Methodology
Task Definition
Table-based Reasoning Generation
Fine-tuning Small Models
Experiments
Dataset
Baselines
Experimental Settings
Automatic Evaluation Metric
Results
...and 6 more sections

Figures (7)

Figure 1: The overview of the distillation pipeline and example data. The pipeline includes using LLMs to generate table-based reasoning and descriptions given the input table.
Figure 2: The overview of our framework. To synthesise data from LLMs, we provide table examples to the LLMs and use them to generate reasoning and descriptions. The generated descriptions are then verified by LLMs, and false reasoning-description pairs are removed. We then fine-tune small models on the generated reasoning and descriptions, which injects the reasoning ability into the smaller models.
Figure 3: Sample table from nam2019surf with its corresponding input representation. The reasoning and description are generated from LLMs for further fine-tuning smaller models.
Figure 4: Ablation study of smaller models on the SciGen dataset. Compared with models using standard fine-tuning, T5 and Flan-T5 fine-tuned with CoT data achieve significant improvements on both TAPAS-Acc and TAPEX-Acc.
Figure 5: The TAPAS-Acc of the teacher models (LLMs) and small models on the SciGen dataset. All the small models fine-tuned with CoT data can surpass LLMs with direct prompting.
...and 2 more figures
