Ontology-Guided, Hybrid Prompt Learning for Generalization in Knowledge Graph Question Answering

Longquan Jiang, Junbo Huang, Cedric Möller, Ricardo Usbeck

TL;DR

This work tackles the generalization problem of knowledge graph question answering across heterogeneous KGs by introducing OntoSCPrompt, a two-stage, ontology-guided framework. It first predicts a KG-agnostic SPARQL structure (Stage-S) and then fills that structure with KG-specific content (Stage-C). Both stages are enhanced by task-specific textual prompts and learnable continuous vectors that encode ontology and question features. The authors introduce constrained decoding strategies (grammar-, structure-, and subgraph-guided) to ensure the generated SPARQL is valid and executable, and demonstrate competitive or state-of-the-art performance across multiple KGQA benchmarks, including cross-KG evaluation without retraining. Ablation studies and pre-training results further show the value of ontology guidance for generalization to unseen KGs, suggesting practical, resource-efficient deployment in diverse KG environments.

Abstract

Most existing Knowledge Graph Question Answering (KGQA) approaches are designed for a specific KG, such as Wikidata, DBpedia or Freebase. Due to the heterogeneity of the underlying graph schema, topology and assertions, most KGQA systems cannot be transferred to unseen Knowledge Graphs (KGs) without resource-intensive training data. We present OntoSCPrompt, a novel Large Language Model (LLM)-based KGQA approach with a two-stage architecture that separates semantic parsing from KG-dependent interactions. OntoSCPrompt first generates a SPARQL query structure (including SPARQL keywords such as SELECT, ASK, WHERE and placeholders for missing tokens) and then fills the placeholders with KG-specific information. To enhance the understanding of the underlying KG, we present an ontology-guided, hybrid prompt learning strategy that integrates KG ontology into the learning process of hybrid prompts (e.g., discrete and continuous vectors). We also present several task-specific decoding strategies to ensure the correctness and executability of generated SPARQL queries in both stages. Experimental results demonstrate that OntoSCPrompt performs as well as SOTA approaches without retraining on a number of KGQA datasets such as CWQ, WebQSP and LC-QuAD 1.0 in a resource-efficient manner and can generalize well to unseen domain-specific KGs like DBLP-QuAD and CoyPu KG. Code: https://github.com/LongquanJiang/OntoSCPrompt
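
To make the two-stage idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical placeholder vocabulary (`[ent]`, `[rel]`) and a hypothetical helper `fill_structure`; the paper's actual tokens and model-driven filling may differ. The Wikidata and DBpedia identifiers are illustrative and have not been checked against live endpoints. The point is only that one KG-agnostic structure can be completed with different KG-specific content.

```python
import re
from typing import Sequence

# Hypothetical placeholder tokens (an assumption, not the paper's vocabulary).
PLACEHOLDER = re.compile(r"\[(?:ent|rel)\]")


def fill_structure(structure: str, content: Sequence[str]) -> str:
    """Replace placeholders, left to right, with KG-specific terms."""
    items = iter(content)

    def substitute(match: "re.Match[str]") -> str:
        try:
            return next(items)
        except StopIteration:
            raise ValueError("fewer content items than placeholders") from None

    filled = PLACEHOLDER.sub(substitute, structure)
    if next(items, None) is not None:
        raise ValueError("more content items than placeholders")
    return filled


# Stage-S output (assumed form): keywords plus placeholders, no KG vocabulary.
structure = "SELECT ?x WHERE { [ent] [rel] ?x }"

# Stage-C output: the same structure filled for two different KGs
# (illustrative identifiers for "Who founded Apple Inc.?").
wikidata_query = fill_structure(structure, ["wd:Q312", "wdt:P112"])
dbpedia_query = fill_structure(
    structure,
    ["<http://dbpedia.org/resource/Apple_Inc.>", "dbo:foundedBy"],
)

print(wikidata_query)
print(dbpedia_query)
```

In this sketch the structure string is fixed by hand; in the approach described above, both the structure and the content would be generated by the model, with decoding constraints keeping the result well-formed.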

Paper Structure

This paper contains 22 sections, 2 equations, 2 figures, and 7 tables.

Figures (2)

  • Figure 1: Three ontology examples representing the same world facts about Apple Inc., Steve Jobs and Steve Wozniak in Freebase, DBpedia and Wikidata. Similar knowledge can be modelled differently regarding assertions (i.e., persistent entity identifiers), schema and topology, requiring different translations from the same natural language question to a SPARQL query.
  • Figure 2: The performance of different decoding strategies on WebQSP under various beam sizes.