Table of Contents
Fetching ...

GPT-4 as a Homework Tutor can Improve Student Engagement and Learning Outcomes

Alessandro Vanzo, Sankalan Pal Chowdhury, Mrinmaya Sachan

TL;DR

This work developed a prompting strategy that enables GPT-4 to conduct interactive homework sessions for high-school students learning English as a second language, and observed significant improvements in learning outcomes, specifically a greater gain in grammar, and student engagement.

Abstract

This work contributes to the scarce empirical literature on LLM-based interactive homework in real-world educational settings and offers a practical, scalable solution for improving homework in schools. Homework is an important part of education in schools across the world, but in order to maximize benefit, it needs to be accompanied with feedback and followup questions. We developed a prompting strategy that enables GPT-4 to conduct interactive homework sessions for high-school students learning English as a second language. Our strategy requires minimal efforts in content preparation, one of the key challenges of alternatives like home tutors or ITSs. We carried out a Randomized Controlled Trial (RCT) in four high-school classes, replacing traditional homework with GPT-4 homework sessions for the treatment group. We observed significant improvements in learning outcomes, specifically a greater gain in grammar, and student engagement. In addition, students reported high levels of satisfaction with the system and wanted to continue using it after the end of the RCT.

GPT-4 as a Homework Tutor can Improve Student Engagement and Learning Outcomes

TL;DR

This work developed a prompting strategy that enables GPT-4 to conduct interactive homework sessions for high-school students learning English as a second language, and observed significant improvements in learning outcomes, specifically a greater gain in grammar, and student engagement.

Abstract

This work contributes to the scarce empirical literature on LLM-based interactive homework in real-world educational settings and offers a practical, scalable solution for improving homework in schools. Homework is an important part of education in schools across the world, but in order to maximize benefit, it needs to be accompanied with feedback and followup questions. We developed a prompting strategy that enables GPT-4 to conduct interactive homework sessions for high-school students learning English as a second language. Our strategy requires minimal efforts in content preparation, one of the key challenges of alternatives like home tutors or ITSs. We carried out a Randomized Controlled Trial (RCT) in four high-school classes, replacing traditional homework with GPT-4 homework sessions for the treatment group. We observed significant improvements in learning outcomes, specifically a greater gain in grammar, and student engagement. In addition, students reported high levels of satisfaction with the system and wanted to continue using it after the end of the RCT.
Paper Structure (36 sections, 3 figures, 4 tables)

This paper contains 36 sections, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Illustration of the study design. We ask the teacher to provide the weekly homework exercises. For each exercise, we ask for three elements: purpose, a brief, informal description of the learning goals; description, outlining what the student is asked to do; example, with an instance of the homework the student would typically be assigned. We prompt GPT-4 with a description of the tutoring task and with the 3 elements of the exercise, asking to cover the same concepts and pedagogical purpose. Finally, we test the effectiveness of the tutoring as a replacement for standard homework in an RCT.
  • Figure 2: Change in Survey responses between the initial and the final questionnaire. Questions with negative sentiment are flipped(these are marked with a $\dagger$ in tables \ref{['tab:app_initial_q']} and \ref{['tab:app_final_q']}) so higher is always an improvement. Questions regarding confidence are marked in red borders. Questions regarding homework are marked in blue borders. We notice that treatment group does better than control group on all questions regarding homework, which is a good sign for GPT4. We also note that they do worse in questions about ability
  • Figure 3: Weekly distribution of student ratings for usefulness, interestingness, comprehensiveness and level_of_resources. Green triangles show means We observe no significant decline in any measure over time so there is no support for the existence of novelty effects