Exploring Iterative Enhancement for Improving Learnersourced Multiple-Choice Question Explanations with Large Language Models

Qiming Bao, Juho Leinonen, Alex Yuxuan Peng, Wanjun Zhong, Gaël Gendron, Timothy Pistotti, Alice Huang, Paul Denny, Michael Witbrock, Jiamou Liu

TL;DR

This work targets the challenge of generating high-quality, student-aligned explanations for learner-created MCQs on platforms like PeerWise. It introduces ILearner-LLM, an iterative framework that pairs an explanation generation LLM with an explanation evaluation LLM and repeatedly refines explanations by feeding the evaluation score back into the next prompt, for up to $K$ iterations. Empirical results on five PeerWise datasets show that instruction-tuned LLMs (notably LLaMA2-13B and GPT-4) achieve higher BLEU and BERT alignment to student explanations, and that fine-tuning the evaluation model improves rating accuracy (lower MSE), with merged, diverse training data providing further gains. The approach demonstrates a promising path to enrich learnersourcing workflows and enhance educational use of large language models.

Abstract

Large language models exhibit superior capabilities in processing and understanding language, yet their applications in educational contexts remain underexplored. Learnersourcing enhances learning by engaging students in creating their own educational content. When learnersourcing multiple-choice questions, creating explanations for the solution of a question is a crucial step; it helps other students understand the solution and promotes a deeper understanding of related concepts. However, it is often difficult for students to craft effective solution explanations, due to limited subject understanding. To help scaffold the task of automated explanation generation, we present and evaluate a framework called "ILearner-LLM", that iteratively enhances the generated explanations for the given questions with large language models. Comprising an explanation generation model and an explanation evaluation model, the framework generates high-quality student-aligned explanations by iteratively feeding the quality rating score from the evaluation model back into the instruction prompt of the explanation generation model. Experimental results demonstrate the effectiveness of our ILearner-LLM on LLaMA2-13B and GPT-4 to generate higher quality explanations that are closer to those written by students on five PeerWise datasets. Our findings represent a promising path to enrich the learnersourcing experience for students and to enhance the capabilities of large language models for educational applications.
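
The feedback loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`build_prompt`, `iterative_enhance`), the prompt wording, the `target_score` early-stopping rule, and the toy stand-in models are all hypothetical. In practice `generate` and `evaluate` would wrap calls to an explanation generation model and an explanation evaluation model.

```python
from typing import Callable, Optional, Tuple


def build_prompt(question: str, prev_score: Optional[float]) -> str:
    """Build the instruction prompt; the wording here is an assumption."""
    prompt = f"Question: {question}\nWrite an explanation for the correct answer."
    if prev_score is not None:
        # Feed the evaluation model's rating back into the next instruction.
        prompt += (
            f"\nThe previous explanation was rated {prev_score:.1f}."
            " Write an improved explanation."
        )
    return prompt


def iterative_enhance(
    question: str,
    generate: Callable[[str], str],          # hypothetical generation model
    evaluate: Callable[[str, str], float],   # hypothetical evaluation model
    max_iters: int = 3,                      # plays the role of K
    target_score: Optional[float] = None,    # assumed early-stopping rule
) -> Tuple[str, float]:
    """Run up to max_iters generate/evaluate rounds and keep the best result."""
    best_expl, best_score = "", float("-inf")
    prev_score: Optional[float] = None
    for _ in range(max_iters):
        explanation = generate(build_prompt(question, prev_score))
        score = evaluate(question, explanation)
        if score > best_score:
            best_expl, best_score = explanation, score
        if target_score is not None and score >= target_score:
            break
        prev_score = score
    return best_expl, best_score


# Toy stand-ins so the sketch runs without any model; purely illustrative.
def toy_generate(prompt: str) -> str:
    return "detailed explanation" if "rated" in prompt else "short"


def toy_evaluate(question: str, explanation: str) -> float:
    return 4.0 if "detailed" in explanation else 2.0


if __name__ == "__main__":
    print(iterative_enhance("What is 2 + 2?", toy_generate, toy_evaluate,
                            max_iters=3, target_score=4.0))
```

Keeping the best-scoring explanation rather than the last one is a design choice of this sketch; it guards against a later iteration scoring lower than an earlier one.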

Paper Structure (17 sections, 1 figure, 5 tables, 1 algorithm)

Figures (1)

  • Figure 1: Architecture of the iterative enhancement framework "ILearner-LLM" using large language models for multiple-choice question explanation generation and evaluation.

Theorems & Definitions (1)

  • Definition 1