From Intuition to Understanding: Using AI Peers to Overcome Physics Misconceptions
Ruben Weijers, Denton Wu, Hannah Betts, Tamara Jacod, Yuxiang Guan, Vidya Sujaya, Kushal Dev, Toshali Goel, William Delooze, Reihaneh Rabbany, Ying Wu, Jean-François Godbout, Kellin Pelrine
TL;DR
The paper investigates an AI "Peer" designed to correct deep-seated misconceptions in Newtonian mechanics through targeted dialogue, presenting the AI as fallible rather than authoritative. In a randomized controlled trial with 165 undergraduates, the treatment group, which received misconception-focused AI interactions, scored about 10.5 percentage points higher on the post-test than the control group and achieved a normalized gain more than 20 percentage points higher, even though the AI sometimes answered incorrectly. The study combines quantitative results from a modified Force Concept Inventory with human evaluation of the AI outputs, and finds initial evidence that the improvement does not depend on the AI answering correctly. These findings suggest that AI Peers can support constructive learning and critical thinking, while highlighting design considerations around AI reliability, engagement, and pedagogy for scalable AI-assisted education.
Abstract
Generative AI has the potential to transform the personalization and accessibility of education. However, it raises serious concerns about accuracy and about whether it helps students become independent critical thinkers. In this study, we designed an AI "Peer" to help students correct fundamental misconceptions about Newtonian mechanics. In contrast to approaches that seek near-perfect accuracy to create an authoritative AI tutor or teacher, we directly inform students that this AI can answer up to 40% of questions incorrectly. In a randomized controlled trial with 165 students, those who engaged in targeted dialogue with the AI Peer achieved post-test scores that were, on average, 10.5 percentage points higher (with a normalized gain more than 20 percentage points higher) than those of a control group that discussed physics history. Qualitative feedback indicated that 91% of the treatment group's AI interactions were rated as helpful. Furthermore, by comparing student performance on pre- and post-test questions about the same concept, along with experts' annotations of the AI interactions, we find initial evidence suggesting that the improvement in performance does not depend on the correctness of the AI. With further research, the AI Peer paradigm described here could open new possibilities for how we learn, adapt to, and grow with AI.
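
For readers unfamiliar with the normalized gain mentioned above, the following is a minimal sketch of the definition commonly used in physics education research: the improvement actually achieved divided by the maximum improvement that was possible. The abstract does not state which variant the authors used (for example, per-student gains averaged versus a gain computed from group means), so the function name `normalized_gain` and the example scores are illustrative assumptions, not values from the study.

```python
def normalized_gain(pre_pct: float, post_pct: float) -> float:
    """Return the normalized gain for scores given as percentages (0-100).

    Assumed definition: (post - pre) / (100 - pre), i.e. the fraction of
    the available improvement that was realized. This may differ from the
    exact variant used in the paper.
    """
    if not 0.0 <= pre_pct < 100.0:
        # A perfect pre-test leaves no room to improve, so the gain is undefined.
        raise ValueError("pre_pct must be in [0, 100)")
    if not 0.0 <= post_pct <= 100.0:
        raise ValueError("post_pct must be in [0, 100]")
    return (post_pct - pre_pct) / (100.0 - pre_pct)


# Hypothetical scores for illustration only (not data from the study).
example_gain = normalized_gain(pre_pct=40.0, post_pct=70.0)
print(f"normalized gain: {example_gain:.2f}")
```

Under this definition, a student who moves from 40% to 70% has realized half of the improvement available to them. Because the measure divides by the headroom above the pre-test score, it allows comparison between groups that start from different baselines.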
