Generating Reading Comprehension Exercises with Large Language Models for Educational Applications

Xingyu Huang, Fei Jiang, Jianli Xiao

TL;DR

This paper tackles the automatic generation of reading comprehension exercises using large language models. It presents the Reading Comprehension Exercise Generation (RCEG) framework, which combines instruction fine-tuning, reward modeling, and PPO optimization with post-generation control via Dynamic Attribute Graphs (DATG) and GeDi-based filtering to promote pedagogical alignment, safety, and quality. Three task-specific datasets support the training pipeline, and a Gradio-based interactive interface enables practical use by learners and educators. Empirical results show improvements over baselines in text similarity, reasoning alignment, fluency, and safety, highlighting RCEG's potential for scalable, learner-adaptive educational content generation.

Abstract

With the rapid development of large language models (LLMs), their applications have grown substantially. In the education domain, LLMs demonstrate significant potential, particularly in automatic text generation, which enables the creation of intelligent and adaptive learning content. This paper proposes a new LLM-based framework, named Reading Comprehension Exercise Generation (RCEG), that automatically generates high-quality, personalized English reading comprehension exercises. First, RCEG uses fine-tuned LLMs to generate content candidates. Then, it uses a discriminator to select the best candidate. As a result, the quality of the generated content improves greatly. To evaluate RCEG, a dedicated dataset for English reading comprehension is constructed, and the experimental results are analyzed with comprehensive evaluation metrics covering content diversity, factual accuracy, linguistic toxicity, and pedagogical alignment. Experimental results show that RCEG significantly improves the relevance and cognitive appropriateness of the generated exercises.
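
The generate-then-select step described in the abstract can be illustrated with a minimal best-of-N sketch. It is not the authors' implementation: the function `select_best_candidate`, the scorer `toy_score`, and the sample candidates are all hypothetical, and the hand-written scoring rule merely stands in for a learned discriminator.

```python
from typing import Callable, List, Tuple


def select_best_candidate(
    candidates: List[str],
    score: Callable[[str], float],
) -> Tuple[str, float]:
    """Return the highest-scoring candidate together with its score."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    # Score every candidate once, then keep the one with the largest score.
    scored = [(score(text), text) for text in candidates]
    best_score, best_text = max(scored, key=lambda pair: pair[0])
    return best_text, best_score


def toy_score(text: str) -> float:
    """Hypothetical stand-in for a discriminator (assumption, not from the paper).

    Rewards text phrased as a question and penalizes a blocked word.
    """
    value = 0.0
    if "?" in text:
        value += 1.0
    if "stupid" in text.lower():
        value -= 2.0
    return value


# Hypothetical candidates, as if sampled from a fine-tuned generator.
candidates = [
    "The passage is about rivers.",
    "Why does the author compare the river to a road?",
    "Why would anyone read this stupid passage?",
]

best_text, best_score = select_best_candidate(candidates, toy_score)
print(best_text, best_score)
```

In a real pipeline the scorer would be a trained model, and the candidates would come from sampling the generator several times for the same passage.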

Paper Structure

This paper contains 24 sections, 11 equations, 3 figures, and 4 tables.

Figures (3)

  • Figure 1: Comparison between Traditional Reading Comprehension Exercise Design and the RCEG Framework.
  • Figure 2: Overall framework of Reading Comprehension Exercise Generation (RCEG), including dataset preparation, two-stage training pipeline, controlled generation with guidance modules, and evaluation and visualization.
  • Figure 3: Gradio-based interactive interface for RCEG.