EduAgentQG: A Multi-Agent Workflow Framework for Personalized Question Generation
Rui Jia, Min Zhang, Fengrui Liu, Bo Jiang, Kun Kuang, Zhongxiang Dai
TL;DR
EduAgentQG addresses the challenge of producing high-quality personalized math questions at scale by aligning items with explicit educational goals $E$. It introduces a five-agent Planner–Writer–Solver–Educator–Checker framework that iteratively plans, generates, evaluates, and refines questions, using multiple question directions and dimension-wise binary scoring to ensure diversity and goal alignment. The contributions include (i) formal goal decomposition with a retrieval-augmented Planner, (ii) controlled candidate generation and multi-perspective refinement by the Writer, (iii) rigorous dimension-specific evaluation by the Solver and Educator, and (iv) final verification by the Checker, all within a closed feedback loop. Experiments on two mathematics datasets show superior diversity, goal consistency, and overall quality over baselines, with generalization across backbones and favorable cost–performance trade-offs. This approach provides a scalable, adaptive resource for personalized learning and automated assessment.
Abstract
High-quality personalized question banks are crucial for supporting adaptive learning and individualized assessment. Manually designing questions is time-consuming and often fails to meet diverse learning needs, making automated question generation an important way to reduce teachers' workload and improve the scalability of educational resources. However, most existing question generation methods rely on single-agent or rule-based pipelines, which still produce questions with unstable quality, limited diversity, and insufficient alignment with educational goals. To address these challenges, we propose EduAgentQG, a multi-agent collaborative framework for generating high-quality and diverse personalized questions. The framework consists of five specialized agents and operates through an iterative feedback loop: the Planner generates structured design plans and multiple question directions to enhance diversity; the Writer produces candidate questions based on the plan and optimizes their quality and diversity using feedback from the Solver and Educator; the Solver and Educator perform binary scoring across multiple evaluation dimensions and feed the evaluation results back to the Writer; the Checker conducts final verification, including answer correctness and clarity, to ensure alignment with educational goals. Through this multi-agent collaboration and iterative feedback loop, EduAgentQG generates questions that are both high-quality and diverse, while maintaining consistency with educational objectives. Experiments on two mathematics question datasets demonstrate that EduAgentQG outperforms existing single-agent and multi-agent methods in terms of question diversity, goal consistency, and overall quality.
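To make the control flow of the feedback loop concrete, the following is a minimal sketch of how the five agents might be wired together. It is not the authors' implementation: in the actual system each agent would presumably be backed by a language model, whereas here every agent is a deterministic stub. The function names, the `Candidate` type, the dimension names (`solvable`, `goal_aligned`, `clear`), and the `max_rounds` limit are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Scores = Dict[str, int]  # dimension name -> binary score (0 or 1)


@dataclass
class Candidate:
    direction: str
    text: str
    revisions: int = 0


def planner(goal: str, n_directions: int = 2) -> List[str]:
    """Stub Planner: turn one educational goal into several question directions."""
    return [f"{goal} / direction {i + 1}" for i in range(n_directions)]


def writer(direction: str,
           feedback: Optional[Scores] = None,
           previous: Optional[Candidate] = None) -> Candidate:
    """Stub Writer: draft a question, or revise the previous one using feedback."""
    if previous is None or feedback is None:
        return Candidate(direction, f"Draft question for {direction}")
    failed = sorted(dim for dim, score in feedback.items() if score == 0)
    revised = f"{previous.text} [revised: {', '.join(failed)}]"
    return Candidate(direction, revised, previous.revisions + 1)


def solver(c: Candidate) -> Scores:
    """Stub Solver (hypothetical dimension): passes only after one revision."""
    return {"solvable": int(c.revisions >= 1)}


def educator(c: Candidate) -> Scores:
    """Stub Educator (hypothetical dimensions): binary score per dimension."""
    return {"goal_aligned": 1, "clear": int(c.revisions >= 1)}


def checker(c: Candidate) -> bool:
    """Stub Checker: final gate; here it only rejects empty text."""
    return bool(c.text.strip())


def generate(goal: str, max_rounds: int = 3) -> List[Candidate]:
    """Plan -> write -> evaluate -> refine -> check, once per direction."""
    accepted: List[Candidate] = []
    for direction in planner(goal):
        cand = writer(direction)
        for _ in range(max_rounds):
            scores = {**solver(cand), **educator(cand)}
            if all(score == 1 for score in scores.values()):
                if checker(cand):
                    accepted.append(cand)
                break
            # Feed the failed dimensions back to the Writer for refinement.
            cand = writer(direction, scores, cand)
    return accepted


if __name__ == "__main__":
    for question in generate("linear equations"):
        print(question.revisions, question.text)
```

In this sketch a candidate reaches the Checker only when every dimension scores 1, and a direction is dropped if it does not pass within `max_rounds` evaluation rounds. Whether the original framework discards or keeps such candidates is not stated in the abstract, so that behaviour is also an assumption.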
