
FlexKBQA: A Flexible LLM-Powered Framework for Few-Shot Knowledge Base Question Answering

Zhenyu Li, Sunqi Fan, Yu Gu, Xiuxing Li, Zhichao Duan, Bowen Dong, Ning Liu, Jianyong Wang

TL;DR

FlexKBQA tackles data-scarce KBQA by combining LLMs as program translators with a lightweight KBQA model. It samples executable programs from the KB, translates them into natural language questions using a small set of seed examples as prompts, and trains a lightweight model on the resulting synthetic data. To bridge the gap to real user questions, it introduces execution-guided self-training (EGST) and uses the LLM's inherent reasoning (IR) as augmentation. Across GrailQA, WebQSP, and KQA Pro, FlexKBQA delivers strong few-shot and zero-shot performance, with significant gains from EGST and IR and results competitive with supervised baselines, pointing to a practical path toward scalable, domain-agnostic KBQA systems.

Abstract

Knowledge base question answering (KBQA) is a critical yet challenging task due to the vast number of entities within knowledge bases and the diversity of natural language questions posed by users. Unfortunately, the performance of most KBQA models tends to decline significantly in real-world scenarios where high-quality annotated data is insufficient. To mitigate the burden associated with manual annotation, we introduce FlexKBQA, which uses Large Language Models (LLMs) as program translators to address the challenges inherent in the few-shot KBQA task. Specifically, FlexKBQA leverages automated algorithms to sample diverse programs, such as SPARQL queries, from the knowledge base, which are subsequently converted into natural language questions via LLMs. This synthetic dataset facilitates training a specialized lightweight model for the KB. Additionally, to reduce the distribution shift between synthetic data and real user questions, FlexKBQA introduces an execution-guided self-training method that iteratively leverages unlabeled user questions. Furthermore, we explore harnessing the inherent reasoning capability of LLMs to enhance the entire framework. Consequently, FlexKBQA delivers substantial flexibility in data annotation and deployment, and is domain agnostic. Through extensive experiments on GrailQA, WebQSP, and KQA Pro, we observe that in the few-shot and even the more challenging zero-shot scenarios, FlexKBQA achieves impressive results with only a few annotations, surpassing all previous baselines and approaching supervised models, reaching 93% of the performance of fully supervised models. We posit that FlexKBQA represents a significant advancement towards better integration of large and lightweight models. The code is open-sourced.
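
The sampling-and-translation step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy triple list `TOY_KB`, the one-hop query template, the prompt wording, and the `translate` callback (a stand-in for an LLM call) are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical toy KB as (subject, relation, object) triples.
TOY_KB: List[Tuple[str, str, str]] = [
    ("Inception", "directed_by", "Christopher_Nolan"),
    ("Interstellar", "directed_by", "Christopher_Nolan"),
    ("Inception", "release_year", "2010"),
]


@dataclass
class Program:
    sparql: str          # SPARQL-like text of the sampled query
    answers: List[str]   # result of executing it against the KB


def execute(kb, subject: str, relation: str) -> List[str]:
    """Answer the one-hop query (subject, relation, ?x) over the toy KB."""
    return sorted(o for s, r, o in kb if s == subject and r == relation)


def sample_programs(kb, n_attempts: int, seed: int = 0) -> List[Program]:
    """Draw random one-hop queries and keep only those with a non-empty answer."""
    rng = random.Random(seed)
    subjects = sorted({s for s, _, _ in kb})
    relations = sorted({r for _, r, _ in kb})
    programs = []
    for _ in range(n_attempts):
        s, r = rng.choice(subjects), rng.choice(relations)
        answers = execute(kb, s, r)
        if answers:  # discard programs that return nothing
            sparql = f"SELECT ?x WHERE {{ :{s} :{r} ?x . }}"
            programs.append(Program(sparql, answers))
    return programs


def build_prompt(seed_pairs: List[Tuple[str, str]], program_text: str) -> str:
    """Few-shot prompt: seed (query, question) pairs followed by the new query."""
    parts = ["Translate each query into a natural language question."]
    for query, question in seed_pairs:
        parts.append(f"Query: {query}\nQuestion: {question}")
    parts.append(f"Query: {program_text}\nQuestion:")
    return "\n\n".join(parts)


def synthesize(programs: List[Program],
               seed_pairs: List[Tuple[str, str]],
               translate: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Return synthetic (question, program) pairs for training a small model."""
    return [(translate(build_prompt(seed_pairs, p.sparql)), p.sparql)
            for p in programs]
```

Calling `synthesize(sample_programs(TOY_KB, 20), seed_pairs, translate)` with any function that maps a prompt string to a question string yields the (question, program) pairs that would serve as synthetic training data in this sketch.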

Paper Structure (34 sections, 3 equations, 5 figures, 7 tables, 1 algorithm)


Figures (5)

  • Figure 1: An analytical experiment on 500 random samples from the GrailQA dev set (oracle entity linking). FlexKBQA's performance rises consistently as the synthetic data size increases, surpassing all in-context learning models, which are constrained by limited window length. Since our synthetic data are generated from 25-shot real data, we also show the performance of our underlying model (RnG-KBQA) trained on the 25-shot real data as a baseline.
  • Figure 2: A comparison between FlexKBQA and prior methods. (a): Prior approaches enable LLMs to directly ground the question to the knowledge base through in-context learning capabilities. (b): An illustration of FlexKBQA's innovative design: (1) Automatic Program Sampling module generates diverse and executable programs. (2) Low-Resource Program Translation module synthesizes high-quality data pairs. (3) Execution-Guided Self-Training module addresses distribution shift. (4) Inherent Reasoning module boosts the pipeline by leveraging inherent knowledge within LLMs.
  • Figure 3: Results beyond few-shot setting. FlexKBQA consistently performs better with more annotated data.
  • Figure 4: Variation of model performance and the error rate of pseudo-labeled programs across EGST iterations.
  • Figure 5: Model improvement in the EGST process.
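
The execution-guided self-training loop that Figures 4 and 5 track can be sketched as below. This is a hedged illustration only: the filtering rule used here (drop pseudo-labeled programs that fail to execute or return an empty result) is an assumed simplification, and `predict`, `execute`, and `train` are hypothetical callbacks rather than anything taken from the released code.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]                 # (question, program)
Predict = Callable[[str], str]         # question -> predicted program
Execute = Callable[[str], list]        # program -> answers (may raise)
Train = Callable[[List[Pair]], Predict]


def filter_pseudo_labels(questions: List[str],
                         predict: Predict,
                         execute: Execute) -> List[Pair]:
    """Keep pseudo-labels whose program executes and returns a non-empty result."""
    kept = []
    for question in questions:
        program = predict(question)
        try:
            answers = execute(program)
        except Exception:
            continue  # assumed rule: execution errors mark a bad pseudo-label
        if answers:   # assumed rule: empty results are also discarded
            kept.append((question, program))
    return kept


def self_train(unlabeled: List[str],
               synthetic: List[Pair],
               train: Train,
               execute: Execute,
               rounds: int = 3) -> Predict:
    """Iteratively retrain on synthetic data plus execution-filtered pseudo-labels."""
    predict = train(synthetic)  # initial model from synthetic data only
    for _ in range(rounds):
        pseudo = filter_pseudo_labels(unlabeled, predict, execute)
        predict = train(synthetic + pseudo)  # retrain with filtered pseudo-labels
    return predict
```

In this sketch, each round uses the current model to label the unlabeled user questions, and execution against the KB acts as a cheap check that removes some wrong pseudo-labels before retraining.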