AI Meets the Classroom: When Do Large Language Models Harm Learning?

Matthias Lehmann, Philipp B. Cornelius, Fabian J. Sting

TL;DR

The paper asks whether unrestricted LLMs help or harm learning. Two controlled lab experiments find no overall effect; instead, outcomes hinge on how students use LLMs. By decomposing learning into topic volume and topic understanding, the authors show that substitutive use (having the LLM solve exercises) increases topic coverage at the expense of depth, while complementary use (asking for explanations) improves understanding without increasing coverage. Field data further indicate negative long-term effects when substitution is prevalent, and LLMs can widen knowledge gaps by benefiting students with higher prior knowledge more. A key insight is that small interface changes, such as enabling copy-paste, dramatically alter usage and thus learning, underscoring the need for context-aware, governance-guided LLM integration in education.

Abstract

The effect of large language models (LLMs) in education is debated: Previous research shows that LLMs can help as well as hurt learning. In two pre-registered and incentivized laboratory experiments, we find no effect of LLMs on overall learning outcomes. In exploratory analyses and a field study, we provide evidence that the effect of LLMs on learning outcomes depends on usage behavior. Students who substitute some of their learning activities with LLMs (e.g., by generating solutions to exercises) increase the volume of topics they can learn about but decrease their understanding of each topic. Students who complement their learning activities with LLMs (e.g., by asking for explanations) do not increase topic volume but do increase their understanding. We also observe that LLMs widen the gap between students with low and high prior knowledge. While LLMs show great potential to improve learning, their use must be tailored to the educational context and students' needs.

Paper Structure (29 sections, 5 equations, 2 figures, 19 tables)

Figures (2)

  • Figure 1: Main experiment interface.
  • Figure 2: Treatment condition interface.