An In-Context Schema Understanding Method for Knowledge Base Question Answering

Yantao Liu, Zixuan Li, Xiaolong Jin, Yucan Guo, Long Bai, Saiping Guan, Jiafeng Guo, Xueqi Cheng

TL;DR

This paper tackles Knowledge Base Question Answering (KBQA) over large, heterogeneous knowledge base (KB) schemas by introducing In-Context Schema Understanding (ICSU), which uses in-context learning with schema-related annotated examples to prompt LLMs to generate SPARQL queries directly. ICSU explores four example retrieval strategies (Raw, Anonymized, SPARQL, and Hybrid) for constructing prompts and uses a defined prompt structure to elicit accurate SPARQL from various LLMs. Experiments on KQA Pro and WebQSP show that ICSU is competitive with established baselines, and that the recall rate of schema elements in the retrieved examples is a key determinant of success. The work shows that LLMs can use in-context schema information to perform complex semantic parsing in KBQA, reducing reliance on multi-stage pipelines, though it assumes access to linked entities and acknowledges potential schema leakage.
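
To make "recall rate of schema elements in retrieved examples" concrete, the sketch below computes one plausible version of it: the fraction of schema elements in the gold SPARQL query that also appear in at least one retrieved example. This is an illustration under stated assumptions, not the paper's exact metric. The function names and the regular expression are hypothetical, and the pattern treats every angle-bracketed IRI or prefixed name as a schema element, so it does not separate relations from entities.

```python
import re

# Hypothetical pattern (an assumption, not from the paper): matches
# angle-bracketed IRIs such as <director> and prefixed names such as
# ns:film.director. It does not distinguish relations from entities.
SCHEMA_TOKEN = re.compile(r"<[^>\s]+>|\b[A-Za-z_][\w-]*:[\w.-]+")


def schema_elements(sparql):
    """Return the set of schema-like tokens found in a SPARQL string."""
    return set(SCHEMA_TOKEN.findall(sparql))


def schema_recall(gold_sparql, retrieved_sparqls):
    """Fraction of gold schema elements covered by the retrieved examples."""
    gold = schema_elements(gold_sparql)
    if not gold:
        return 1.0  # nothing to cover, so coverage is trivially complete
    seen = set()
    for query in retrieved_sparqls:
        seen |= schema_elements(query)
    return len(gold & seen) / len(gold)


gold = "SELECT ?z WHERE { ?x ns:film.director ?y . ?y ns:people.birthplace ?z }"
retrieved = [
    "SELECT ?b WHERE { ?a ns:film.director ?b }",
    "SELECT ?b WHERE { ?a ns:music.artist ?b }",
]
recall = schema_recall(gold, retrieved)  # one of two gold elements is covered
```

A higher value means the prompt already shows the model more of the schema vocabulary it needs for the target query.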

Abstract

The Knowledge Base Question Answering (KBQA) task aims to answer natural language questions based on a given knowledge base. Recently, Large Language Models (LLMs) have shown strong capabilities in language understanding and can be used to solve this task. In doing so, a major challenge for LLMs is to overcome the immensity and heterogeneity of knowledge base schemas. Existing methods bypass this challenge by first employing LLMs to generate drafts of logic forms without schema-specific details. Then, an extra module is used to inject schema information into these drafts. In contrast, in this paper, we propose a simple In-Context Schema Understanding (ICSU) method that enables LLMs to directly understand schemas by leveraging in-context learning. Specifically, ICSU provides schema information to LLMs using schema-related annotated examples. We investigate three example retrieval strategies based on raw questions, anonymized questions, and generated SPARQL queries. Experimental results show that ICSU demonstrates competitive performance compared to baseline methods on both the KQA Pro and WebQSP datasets.
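
The retrieval-then-prompt idea described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. The `Example` class, the `[ENT]` placeholder, the token-overlap (Jaccard) similarity, and the prompt wording are all assumptions introduced here, since the summary does not specify the retriever or the prompt template. Only the raw-question and anonymized-question strategies are shown.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """An annotated training example (hypothetical container)."""
    question: str
    sparql: str
    entities: tuple = ()  # surface forms of the linked entities


def anonymize(question, entities, placeholder="[ENT]"):
    """Replace linked entity mentions with a placeholder (longest first)."""
    for mention in sorted(entities, key=len, reverse=True):
        question = question.replace(mention, placeholder)
    return question


def jaccard(a, b):
    """Token-overlap similarity; a stand-in for whatever retriever is used."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(question, pool, k=3, entities=(), anonymized=False):
    """Return the k pool examples most similar to the question."""
    query = anonymize(question, entities) if anonymized else question

    def score(ex):
        text = anonymize(ex.question, ex.entities) if anonymized else ex.question
        return jaccard(query, text)

    return sorted(pool, key=score, reverse=True)[:k]


def build_prompt(question, examples):
    """Assemble an instruction, the retrieved examples, and the new question."""
    parts = ["Translate the question into a SPARQL query."]
    for ex in examples:
        parts.append(f"Question: {ex.question}\nSPARQL: {ex.sparql}")
    parts.append(f"Question: {question}\nSPARQL:")
    return "\n\n".join(parts)


pool = [
    Example("Who directed Titanic ?",
            "SELECT ?d WHERE { <Titanic> <director> ?d }", ("Titanic",)),
    Example("Where was Obama born ?",
            "SELECT ?p WHERE { <Obama> <place_of_birth> ?p }", ("Obama",)),
]
question = "Who directed Inception ?"
top = retrieve(question, pool, k=1, entities=("Inception",), anonymized=True)
prompt = build_prompt(question, top)
```

Anonymizing both sides makes the similarity depend on the question pattern rather than on the specific entities mentioned, which is the intuition behind retrieving with anonymized questions. The example annotated with `<director>` is retrieved here, so the prompt exposes the relation the new question needs.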

Paper Structure

This paper contains 22 sections, 3 figures, and 9 tables.

Figures (3)

  • Figure 1: LLMs fail to generate a correct SPARQL query when lacking schema information.
  • Figure 2: The pipeline for generating SPARQL queries with ICSU when the number of examples is $k = 3$.
  • Figure 3: The correlation between relation recall rate and accuracy on KQA Pro.