HeLM: Highlighted Evidence augmented Language Model for Enhanced Table-to-Text Generation
Junyi Bian, Xiaolei Qin, Wuhe Zou, Mengzuo Huang, Congyi Luo, Ke Zhang, Weidong Zhang
TL;DR
HeLM introduces a lightweight, two-module approach for table-to-text generation that first highlights evidence rows in a table and then generates a tailored summary. By training a highlighter and a summarizer with parameter-efficient fine-tuning and a greedy evidence-label search, HeLM achieves state-of-the-art results on FeTaQA and QTSumm, while improving interpretability through explicit evidence highlighting. The key contributions include a practical evidence-label construction strategy, a feedback-based training loop, and empirical demonstrations of the benefits of input table highlighting for generation quality and explainability. The approach holds practical potential for cost-efficient, private table-to-text systems and provides a foundation for further improving evidence-driven reasoning in structured-data tasks.
Abstract
Large models have demonstrated significant progress across various domains, particularly in text generation. In table-to-text generation, many Large Language Model (LLM)-based methods currently resort to modifying prompts to invoke public APIs, which incurs potential costs and risks information leaks. With the advent of open-source large models, fine-tuning LLMs has become feasible. In this study, we conduct parameter-efficient fine-tuning on the LLaMA2 model. Unlike previous fine-tuning-based table-to-text methods, our approach injects reasoning information into the input by emphasizing specific rows of the table. Our model consists of two modules: 1) a table reasoner that identifies relevant row evidence, and 2) a table summarizer that generates sentences based on the highlighted table. To facilitate this, we propose a search strategy to construct reasoning labels for training the table reasoner. On both the FeTaQA and QTSumm datasets, our approach achieves state-of-the-art results. Additionally, we observe that highlighting input tables significantly enhances the model's performance and provides valuable interpretability.
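To make the two ideas in the abstract concrete, the following is a minimal sketch of (a) marking evidence rows in a linearized table and (b) a greedy forward search over row subsets. It is an illustration under stated assumptions, not the authors' implementation: the `*` row marker, the function names, and the toy overlap-based `toy_score` are all hypothetical. In the actual system, the score would presumably come from feedback on the summarizer's output rather than from simple word overlap.

```python
from typing import Callable, List, Sequence, Set

Row = Sequence[str]


def linearize(header: Row, rows: List[Row], highlighted: Set[int]) -> str:
    """Flatten a table to text, prefixing highlighted rows with '* '.

    The '*' marker is an assumed format chosen for this sketch only.
    """
    lines = ["| " + " | ".join(header) + " |"]
    for i, row in enumerate(rows):
        mark = "* " if i in highlighted else "  "
        lines.append(mark + "| " + " | ".join(row) + " |")
    return "\n".join(lines)


def greedy_evidence_search(n_rows: int,
                           score: Callable[[Set[int]], float]) -> Set[int]:
    """Greedy forward selection over row indices.

    Repeatedly add the single row that most improves `score`,
    and stop as soon as no remaining row improves it.
    """
    chosen: Set[int] = set()
    best = score(chosen)
    while True:
        candidates = [(score(chosen | {i}), i)
                      for i in range(n_rows) if i not in chosen]
        if not candidates:
            break
        # Highest score wins; ties go to the lower row index.
        top_score, top_row = max(candidates, key=lambda c: (c[0], -c[1]))
        if top_score <= best:
            break
        chosen.add(top_row)
        best = top_score
    return chosen


# Hypothetical toy data, invented for illustration.
header = ["Year", "Team", "Goals"]
rows = [["2019", "Ajax", "12"],
        ["2020", "Porto", "7"],
        ["2021", "Lyon", "15"]]
reference = "He scored 15 goals for Lyon in 2021."
ref_tokens = set(reference.lower().replace(".", "").split())


def toy_score(selected: Set[int]) -> float:
    """Stand-in score: cell overlap with the reference, minus a per-row cost."""
    cells = {cell.lower() for i in selected for cell in rows[i]}
    return len(cells & ref_tokens) - 0.5 * len(selected)


evidence = greedy_evidence_search(len(rows), toy_score)
highlighted_table = linearize(header, rows, evidence)
print(highlighted_table)
```

The per-row cost in `toy_score` is one simple way to keep the selected evidence small; any scoring function with the same signature can be swapped in.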
