Towards the Pedagogical Steering of Large Language Models for Tutoring: A Case Study with Modeling Productive Failure
Romain Puech, Jakub Macina, Julia Chatain, Mrinmaya Sachan, Manu Kapur
TL;DR
This work defines Pedagogical Steering for large language model (LLM) tutors and presents StratL, a three-part algorithm that steers an LLM through a multi-turn tutoring plan inspired by Productive Failure (PF), using a transition graph of intents. By combining a Teacher State Tracing classifier with an expert-defined Intent Selection graph, StratL aims to promote productive failure by prompting students to generate multiple representations rather than revealing solutions directly. The approach is validated via a simulated study and a field test with 17 ninth-grade students, showing that StratL increases PF fidelity (more student-generated representations) without degrading core tutor qualities such as coherence or empathy. The paper also releases a PF problem dataset and source code, discusses practical limitations, and outlines opportunities for classroom integration and future enhancements.
Abstract
One-to-one tutoring is one of the most efficient methods of teaching. With the growing popularity of Large Language Models (LLMs), there have been efforts to create LLM-based conversational tutors that can extend the benefits of one-to-one tutoring to everyone. However, current LLMs are trained primarily to be helpful assistants and lack crucial pedagogical skills. For example, they often quickly reveal the solution to the student and fail to plan for a richer multi-turn pedagogical interaction. To use LLMs in pedagogical settings, they need to be steered to use effective teaching strategies: a problem we introduce as Pedagogical Steering. We develop StratL, an algorithm that optimizes LLM prompts and steers the model to follow a predefined multi-turn tutoring plan represented as a transition graph. As a case study, we create a prototype tutor for high school math following Productive Failure (PF), an advanced and effective learning design. To validate our approach in a real-world setting, we run a field study with 17 high school students in Singapore and show that StratL succeeds in steering the LLM to follow the PF tutoring strategy. Finally, we highlight challenges in Pedagogical Steering of LLMs and offer opportunities for further improvements by publishing a dataset of PF problems and our code.
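To make the idea of steering along a transition graph concrete, the following is a minimal Python sketch and not the released StratL code. It assumes three stages: classify the student's last message into a state, look up the next tutor intent in a hand-written graph, and turn that intent into a prompt for the LLM. The state labels, intent names, keyword-based classifier stub, and prompt wording are all hypothetical placeholders chosen for illustration.

```python
from typing import Dict, Tuple

# Hypothetical transition graph: current tutor intent -> observed student state -> next intent.
# The actual intents and states used by StratL may differ.
TRANSITIONS: Dict[str, Dict[str, str]] = {
    "elicit_attempt": {
        "no_attempt": "elicit_attempt",
        "partial_attempt": "seek_alternative",
        "proposes_solution": "seek_alternative",
    },
    "seek_alternative": {
        "no_attempt": "elicit_attempt",
        "partial_attempt": "seek_alternative",
        "proposes_solution": "consolidate",
    },
    "consolidate": {},  # terminal intent in this toy graph
}

# Hypothetical instruction text attached to each intent.
INSTRUCTIONS: Dict[str, str] = {
    "elicit_attempt": "Encourage the student to try an approach of their own.",
    "seek_alternative": "Ask the student for a different representation or method.",
    "consolidate": "Help the student compare the approaches they generated.",
}


def trace_state(student_message: str) -> str:
    """Stand-in for a state classifier (a keyword stub, for illustration only)."""
    text = student_message.lower().strip()
    if not text or "i don't know" in text:
        return "no_attempt"
    if "answer is" in text:
        return "proposes_solution"
    return "partial_attempt"


def select_intent(current_intent: str, state: str) -> str:
    """Follow the graph edge for the observed state; stay put if no edge exists."""
    return TRANSITIONS[current_intent].get(state, current_intent)


def build_prompt(intent: str, problem: str) -> str:
    """Turn the selected intent into a system prompt for the tutor LLM."""
    return (
        f"You are a math tutor. Problem: {problem}\n"
        f"Tutoring goal for this turn: {INSTRUCTIONS[intent]}\n"
        "Do not reveal the solution."
    )


def tutor_turn(current_intent: str, student_message: str, problem: str) -> Tuple[str, str]:
    """One steering step: trace the state, select the next intent, build the prompt."""
    state = trace_state(student_message)
    next_intent = select_intent(current_intent, state)
    return next_intent, build_prompt(next_intent, problem)
```

In this sketch the LLM never decides the tutoring strategy itself. It only receives a per-turn instruction derived from the graph, which is one simple way to keep a multi-turn plan under the designer's control.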
