DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions

Nigel Fernandez, Alexander Scarlatos, Wanyong Feng, Simon Woodhead, Andrew Lan

TL;DR

This paper introduces DiVERT (Distractor Generation with Variational Errors Represented as Text), a novel variational approach that learns an interpretable representation of the errors behind distractors in math MCQs. A human evaluation with math educators finds that DiVERT's error labels are of comparable quality to human-authored ones.

Abstract

High-quality distractors are crucial to both the assessment and pedagogical value of multiple-choice questions (MCQs), yet manually crafting distractors that anticipate knowledge deficiencies or misconceptions among real students is difficult. Meanwhile, automated distractor generation, even with the help of large language models (LLMs), remains challenging for subjects like math. It is crucial not only to identify plausible distractors but also to understand the error behind them. In this paper, we introduce DiVERT (Distractor Generation with Variational Errors Represented as Text), a novel variational approach that learns an interpretable representation of errors behind distractors in math MCQs. Through experiments on a real-world math MCQ dataset with 1,434 questions used by hundreds of thousands of students, we show that DiVERT, despite using a base open-source LLM with 7B parameters, outperforms state-of-the-art approaches using GPT-4o on downstream distractor generation. We also conduct a human evaluation with math educators and find that DiVERT leads to error labels that are of comparable quality to human-authored ones.

Paper Structure

This paper contains 46 sections, 8 equations, 3 figures, and 14 tables.

Figures (3)

  • Figure 1: Overview of DiVERT's variational pipeline for error explanation and distractor generation in math MCQs.
  • Figure 2: Distractor generation performance with increasing percentages of error labels dropped (unused in training). DiVERT outperforms baselines, especially when only a small number of error labels are used.
  • Figure 3: Distractor generation Prop@10 performance with an increasing percentage of data used for variational training. Sampling errors from $q_{\phi}$ on all train question-distractor pairs performs best.