Optimizing Coverage and Difficulty in Reinforcement Learning for Quiz Composition

Ricardo Pedro Querido Andrade Silva, Nassim Bouarour, Dina Fettache, Sarab Boussouar, Noha Ibrahim, Sihem Amer-Yahia

Abstract

Quiz design is a tedious process that teachers undertake to evaluate the acquisition of knowledge by students. Our goal in this paper is to automate quiz composition from a set of multiple choice questions (MCQs). We formalize a generic sequential decision-making problem with the goal of training an agent to compose a quiz that meets the desired topic coverage and difficulty levels. We investigate DQN, SARSA, and A2C/A3C, three reinforcement learning solutions to our problem. We run extensive experiments on synthetic and real datasets that study the ability of RL to converge to the best quiz. Our results reveal subtle differences in agent behavior and in transfer learning across different data distributions and teacher goals. These findings were corroborated by our user study, paving the way for automating teachers' various pedagogical goals.

Paper Structure

This paper contains 21 sections, 13 equations, 12 figures, and 7 tables.

Figures (12)

  • Figure 1: MCQs used to generate quizzes.
  • Figure 2: Evolution of max $Q$-function across training iterations of DQN on target $T_{\mathit{uniform}}$ for different datasets and $\alpha$ values.
  • Figure 3: Evolution of learned actions per episode during DQN training on target $T_{\mathit{uniform}}$ for Uniform dataset.
  • Figure 4: Evolution of learned actions per episode during DQN training on target $T_{\mathit{bias}}$ for Uniform dataset.
  • Figure 5: Examples of agent trajectories in Uniform dataset using UMAP 2D projection.
  • ...and 7 more figures