HeGTa: Leveraging Heterogeneous Graph-enhanced Large Language Models for Few-shot Complex Table Understanding
Rihui Jin, Yu Li, Guilin Qi, Nan Hu, Yuan-Fang Li, Jiaoyan Chen, Jianan Wang, Yongrui Chen, Dehai Min, Sheng Bi
TL;DR
HeGTa tackles few-shot complex table understanding (TU) by fusing a tabular heterogeneous graph (HG) encoder with a large language model (LLM). It constructs a Tabular HG to preserve the table's topological semantics, aligns the HG encoder with the LLM through soft prompts, and pre-trains with three multi-granularity self-supervised tasks (TRC, TCM, TCG) before task-specific fine-tuning. Experiments on nine TU datasets covering CTC, TTC, and TQA show HeGTa achieving state-of-the-art performance in few-shot settings. Ablations confirm the contribution of each component and the advantage of heterogeneous graph representations over homogeneous or linearized ones. The approach generalizes well and is suited to real-world TU tasks, particularly those with complex table structures and limited annotations.
Abstract
Table understanding (TU) has achieved promising advancements, but it faces two challenges: the scarcity of manually labeled tables and the presence of complex table structures. To address these challenges, we propose HeGTa, a framework with a heterogeneous graph (HG)-enhanced large language model (LLM) to tackle few-shot TU tasks. It leverages the LLM by aligning the table semantics with the LLM's parametric knowledge through soft prompts and instruction tuning, and it deals with complex tables through a multi-task pre-training scheme involving three novel multi-granularity self-supervised HG pre-training objectives. We empirically demonstrate the effectiveness of HeGTa, showing that it outperforms the SOTA for few-shot complex TU on several benchmarks.
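To make the idea of a heterogeneous graph over a complex table concrete, the following is a minimal sketch and not the authors' construction. The node types (`header`, `data`), the edge relations (`row_adjacent`, `col_adjacent`), the cell dictionary format, and the function name `build_tabular_hg` are all assumptions chosen for illustration; the graph schema actually used by the framework may differ. The sketch shows how a merged header cell keeps edges to every cell beneath it, which is the kind of structure a row-by-row linearization would lose.

```python
from collections import defaultdict


def build_tabular_hg(cells):
    """Build a toy heterogeneous graph from table cells (hypothetical schema).

    cells: list of dicts with keys text, row, col and optional
           rowspan, colspan, is_header.
    Returns (nodes, edges): nodes maps node id -> (node_type, text);
    edges maps a typed relation name -> set of (src, dst) node id pairs.
    """
    nodes = {}
    covered = {}  # grid slot -> node id; a merged cell owns every slot it spans
    for idx, cell in enumerate(cells):
        node_type = "header" if cell.get("is_header") else "data"
        nodes[idx] = (node_type, cell["text"])
        for r in range(cell["row"], cell["row"] + cell.get("rowspan", 1)):
            for c in range(cell["col"], cell["col"] + cell.get("colspan", 1)):
                covered[(r, c)] = idx

    edges = defaultdict(set)
    directions = (((0, 1), "row_adjacent"), ((1, 0), "col_adjacent"))
    for (r, c), src in covered.items():
        for (dr, dc), relation in directions:
            dst = covered.get((r + dr, c + dc))
            if dst is not None and dst != src:
                # Typing each edge by its endpoint node types is what makes
                # the graph heterogeneous rather than homogeneous.
                etype = f"{nodes[src][0]}-{relation}-{nodes[dst][0]}"
                edges[etype].add((src, dst))
    return nodes, dict(edges)


# A small table with a merged top-level header spanning two columns.
table = [
    {"text": "Sales", "row": 0, "col": 0, "colspan": 2, "is_header": True},
    {"text": "Q1", "row": 1, "col": 0, "is_header": True},
    {"text": "Q2", "row": 1, "col": 1, "is_header": True},
    {"text": "10", "row": 2, "col": 0},
    {"text": "12", "row": 2, "col": 1},
]
nodes, edges = build_tabular_hg(table)
for etype in sorted(edges):
    print(etype, sorted(edges[etype]))
```

In this sketch the merged "Sales" cell (node 0) is linked to both "Q1" and "Q2", so the hierarchy of the header survives in the graph. A graph encoder would then consume `nodes` and `edges`; that step, and the soft-prompt alignment with the LLM, are outside the scope of this illustration.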
