Synthetic Context Generation for Question Generation

Naiming Liu, Zichao Wang, Richard Baraniuk

TL;DR

This work targets automatic question generation (QG) when background context is scarce, using LLM prompts to generate synthetic contexts from question-answer (QA) pairs. It introduces a two-stage pipeline: (1) prompt an LLM to synthesize a context c from each (q, a) pair, and (2) fine-tune a smaller LM (Flan-T5-large) to generate the question q conditioned on (c, a). Experiments on OS-Bio and SQuAD show that contexts, even synthetic ones, are essential for QG, that fine-tuning small models can outperform prompting large LLMs, and that synthetic contexts can achieve near-parity with real contexts in QG quality. These findings support scalable QG training in domain-specific settings where real-context data are scarce.

Abstract

Despite rapid advancements in large language models (LLMs), question generation (QG) remains a challenging problem due to its complicated process, open-ended nature, and the diverse settings in which it occurs. A common approach to address these challenges involves fine-tuning smaller, custom models using datasets containing background context, question, and answer. However, obtaining suitable domain-specific datasets with appropriate context is often more difficult than acquiring question-answer pairs. In this paper, we investigate training QG models using synthetic contexts generated by LLMs from readily available question-answer pairs. We conduct a comprehensive study to answer critical research questions related to the performance of models trained on synthetic contexts and their potential impact on QG research and applications. Our empirical results reveal that: 1) contexts are essential for QG tasks, even if they are synthetic; 2) fine-tuning smaller language models can achieve better performance than prompting larger language models; and 3) synthetic and real contexts can achieve comparable performance. These findings highlight the effectiveness of synthetic contexts in QG and pave the way for future advancements in the field.
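
The two-stage pipeline described above (synthesize a context from a question-answer pair, then build (context, answer) -> question training pairs for a smaller model) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the source-string template, and the function names are all assumptions, and the LLM is represented by any callable that maps a prompt string to a completion string.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical prompt wording; the paper's actual prompts are not reproduced here.
CONTEXT_PROMPT = (
    "Write a short background passage that contains the information needed "
    "to answer the question below.\n"
    "Question: {question}\nAnswer: {answer}\nPassage:"
)


def synthesize_context(question: str, answer: str, llm: Callable[[str], str]) -> str:
    """Stage 1: ask an LLM (any callable str -> str) for a synthetic context."""
    prompt = CONTEXT_PROMPT.format(question=question, answer=answer)
    return llm(prompt).strip()


def build_qg_example(context: str, answer: str, question: str) -> Dict[str, str]:
    """Stage 2 input: one (context, answer) -> question pair for seq2seq fine-tuning.

    The source template is an assumption made for illustration.
    """
    source = f"Generate a question. Context: {context} Answer: {answer}"
    return {"source": source, "target": question}


def build_training_set(
    qa_pairs: List[Tuple[str, str]], llm: Callable[[str], str]
) -> List[Dict[str, str]]:
    """Run stage 1 on every QA pair and format the results for stage 2."""
    return [
        build_qg_example(synthesize_context(q, a, llm), a, q) for q, a in qa_pairs
    ]


def echo_llm(prompt: str) -> str:
    """Stand-in for a real LLM call so the sketch runs offline."""
    return " Mitochondria produce most of the cell's ATP. "


if __name__ == "__main__":
    pairs = [("What organelle produces most ATP?", "Mitochondria")]
    for example in build_training_set(pairs, echo_llm):
        print(example["source"])
        print(example["target"])
```

The resulting source/target dictionaries could then be tokenized and passed to whatever fine-tuning setup is used for the smaller model; that step is omitted here because it depends on the training framework.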

Paper Structure (22 sections, 3 figures, 5 tables)

Figures (3)

  • Figure 1: Detailed Overview of Context Generation for Question Generation. We first prompt LLMs to generate synthetic context, then use the generated context and answer to fine-tune smaller LMs for question generation.
  • Figure 2: Word count and perplexity distribution for real and synthetic context generated with few-shot learning.
  • Figure 3: QG performance as the fraction of synthetic context increases, reported with the METEOR and BLEURT metrics.
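
The kind of experiment summarized in Figure 3 needs training sets in which a chosen fraction of examples use synthetic rather than real context. The sketch below shows one way such a mixture could be built. It is an assumption for illustration only: the sampling procedure, the field names (`real_context`, `synthetic_context`), and the function name are hypothetical and are not taken from the paper.

```python
import random
from typing import Dict, List


def mix_contexts(
    examples: List[Dict[str, str]], synthetic_fraction: float, seed: int = 0
) -> List[Dict[str, str]]:
    """Pick the synthetic context for a fixed fraction of examples, real otherwise."""
    if not 0.0 <= synthetic_fraction <= 1.0:
        raise ValueError("synthetic_fraction must be between 0 and 1")

    rng = random.Random(seed)  # seeded so the split is reproducible
    n_synthetic = round(synthetic_fraction * len(examples))
    synthetic_ids = set(rng.sample(range(len(examples)), n_synthetic))

    mixed = []
    for i, ex in enumerate(examples):
        key = "synthetic_context" if i in synthetic_ids else "real_context"
        mixed.append(
            {
                "context": ex[key],
                "answer": ex["answer"],
                "question": ex["question"],
                "context_source": key,  # kept so the mixture can be audited
            }
        )
    return mixed
```

Sweeping `synthetic_fraction` from 0 to 1 and fine-tuning on each mixture would give one point per fraction on a curve like the one the figure describes.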