KGQuest: Template-Driven QA Generation from Knowledge Graphs with LLM-Based Refinement

Sania Nayab, Marco Simoni, Giulio Rossolini, Andrea Saracino

TL;DR

KGQuest tackles scalable QA generation from knowledge graphs by combining a deterministic, template-driven pipeline with a lightweight, LLM-based refinement stage. Triplets are clustered by relation to produce reusable templates, which are instantiated with the entities of each triplet and augmented with KG-derived distractors; an optional per-template refinement with small LLMs improves fluency while preserving factual content. Evaluations across Wikigraphs, WebQSP, and CWQ show 80–90% correctness for templated questions, with refinement reducing linguistic errors and yielding substantial efficiency gains over direct LLM generation applied to every triplet. The approach offers a transparent, scalable framework for cross-domain KG QA generation, with practical implications for education, benchmarking, and LLM evaluation, and points toward extensions such as difficulty-aware distractors and broader domain generalization.

Abstract

The generation of questions and answers (QA) from knowledge graphs (KG) plays a crucial role in the development and testing of educational platforms, dissemination tools, and large language models (LLM). However, existing approaches often struggle with scalability, linguistic quality, and factual consistency. This paper presents a scalable and deterministic pipeline for generating natural language QA from KGs, with an additional refinement step using LLMs to further enhance linguistic quality. The approach first clusters KG triplets based on their relations, creating reusable templates through natural language rules derived from the entity types of objects and relations. A module then leverages LLMs to refine these templates, improving clarity and coherence while preserving factual accuracy. Finally, the instantiation of answer options is achieved through a selection strategy that introduces distractors from the KG. Our experiments demonstrate that this hybrid approach efficiently generates high-quality QA pairs, combining scalability with fluency and linguistic precision.
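
The deterministic part of the pipeline described above (clustering by relation, template construction, instantiation, and distractor selection) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy triplets, the function names, the placeholder template rule, and the choice to draw distractors from other objects of the same relation are all assumptions made for illustration. The LLM refinement step is omitted.

```python
import random
from collections import defaultdict

# Hypothetical toy triplets (subject, relation, object); not taken from the paper's datasets.
TRIPLETS = [
    ("Paris", "capital_of", "France"),
    ("Rome", "capital_of", "Italy"),
    ("Berlin", "capital_of", "Germany"),
    ("Madrid", "capital_of", "Spain"),
    ("Dante", "author_of", "Divine Comedy"),
]


def cluster_by_relation(triplets):
    """Group triplets so that each relation gets a single reusable template."""
    clusters = defaultdict(list)
    for subject, relation, obj in triplets:
        clusters[relation].append((subject, relation, obj))
    return dict(clusters)


def build_template(relation):
    """Assumed placeholder rule: one question template per relation.

    The output is deliberately crude; in the paper, templates of this kind
    are what the optional LLM refinement step would polish.
    """
    phrase = relation.replace("_", " ")
    return f"{{subject}} {phrase} which entity?"


def make_question(triplet, clusters, num_distractors=3, seed=0):
    """Instantiate the template for one triplet and attach KG-derived distractors."""
    subject, relation, answer = triplet
    rng = random.Random(seed)
    # Assumption: distractors are other objects observed with the same relation.
    pool = sorted({obj for _, _, obj in clusters[relation] if obj != answer})
    distractors = rng.sample(pool, min(num_distractors, len(pool)))
    options = distractors + [answer]
    rng.shuffle(options)
    return {
        "question": build_template(relation).format(subject=subject),
        "answer": answer,
        "options": options,
    }


clusters = cluster_by_relation(TRIPLETS)
qa = make_question(TRIPLETS[0], clusters)
print(qa["question"])
print(qa["options"])
```

Because the template is built once per relation rather than once per triplet, any refinement cost would scale with the number of relations, which is the efficiency argument made in the TL;DR.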

Paper Structure

This paper contains 19 sections, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Overview of the QA generation pipeline. The top part illustrates the steps applied to extract a template ($\bar{t}_k$) from triplets in the knowledge graph. This includes a clustering process, the use of deterministic sentence-construction rules, and an LLM-based refinement step. The bottom part shows the instantiation of a triplet-based question $\bar{q}$ from a refined template $\bar{t}_k$, along with the computation of distractors to define the full set of answer options.
  • Figure 2: Overview of the LLM-based Template Refinement Process. An LLM is applied to refine a given template $t_k$, correcting potential grammatical errors
  • Figure 3: Results by error type for questions instantiated from the template $t_k$ on the Wikigraphs, WebQSP, and CWQ KG datasets, one subplot per dataset. The figures illustrate the distribution of error types (Grammar, Formatting, Syntax, Correct) for the LLaMA, Phi, and Qwen models, along with the final judgments.
  • Figure 4: Examples of generated triplets with LLaMA-70b from the selected KG, along with the corresponding questions instantiated from templates that were first extracted through the deterministic step and then refined. For comparison, questions generated directly by the LLM are also shown in grey, with potential hallucination issues highlighted in bold.