LLM-Generated Tips Rival Expert-Created Tips in Helping Students Answer Quantum-Computing Questions
Lars Krupp, Jonas Bley, Isacco Gobbi, Alexander Geng, Sabine Müller, Sungho Suh, Ali Moghiseh, Arcesio Castaneda Medina, Valeria Bartsch, Artur Widera, Herwig Ott, Paul Lukowicz, Jakob Karolus, Maximilian Kiefer-Emmanouilidis
TL;DR
The study addresses the shortage of quantum computing (QC) educators by evaluating whether LLM-generated tips can match expert-created tips for teaching QC basics. It comprises two complementary studies: a main between-subject study (N=46) that varies both the tip creator and the tip label, and a tip-evaluation study (N=23) in which educators and students assess tip quality, correctness, and impact on learning. Findings show that LLM-generated tips rival expert tips in usefulness and conceptual focus, and that tips labeled as LLM-generated improved performance, possibly through a placebo effect. LLM-generated tips were, however, more verbose and more prone to giving away the answer. The work supports integrating LLM-generated tips into scalable, personalized education for QC basics, provided that design considerations, validation, and human-in-the-loop checks are in place to mitigate risks and preserve learning gains.
Abstract
Individual teaching is among the most successful ways to impart knowledge. Yet this method is not always feasible because of the large number of students per educator. Quantum computing is a prime example of this problem, owing to the hype surrounding it. Alleviating the high teacher workloads that often accompany individual teaching is crucial for sustaining high-quality education. Leveraging Large Language Models (LLMs) such as GPT-4 to generate educational content can therefore be valuable. We conducted two complementary studies exploring the feasibility of using GPT-4 to automatically generate tips for students. In the first, students (N=46) solved four multiple-choice quantum computing questions with the help of either expert-created or LLM-generated tips. To correct for possible biases towards LLMs, we introduced two additional conditions, leading some participants to believe that they were given expert-created tips when they were in fact given LLM-generated tips, and vice versa. Our second study (N=23) aimed to directly compare the LLM-generated and expert-created tips, evaluating their quality, correctness, and helpfulness, with both experienced educators and students participating. Participants in our second study found that the LLM-generated tips were significantly more helpful and pointed better towards relevant concepts than the expert-created tips, while being more prone to giving away the answer. Participants in the first study performed significantly better in answering the quantum computing questions when given tips labeled as LLM-generated, even if the tips were created by an expert. This phenomenon could be a placebo effect induced by the participants' biases towards LLM-generated content. Ultimately, we find that LLM-generated tips are good enough to be used instead of expert tips in the context of quantum computing basics.
