The "Huh?" Button: Improving Understanding in Educational Videos with Large Language Models
Boris Ruf, Marcin Detyniecki
TL;DR
This work addresses comprehension gaps in video-based education by adding on-demand, LLM-generated explanations to online videos. A lightweight 'Huh?' button uses the video transcript to rephrase and elaborate on the most recently spoken content. The proposed pipeline pauses playback, queries an LLM for a targeted explanation, and offers a second-level simplification if needed; a caching-aware YouTube plugin enables scalable deployment. A proof of concept demonstrates feasibility across multiple lectures and languages, using a JavaScript plugin and pre-produced explanations to minimize latency. An environmental analysis highlights caching as a key strategy for reducing emissions. Future work calls for rigorous user studies and better handling of model hallucinations to improve educational impact.
Abstract
We propose a simple way to use large language models (LLMs) in education. Specifically, our method aims to improve individual comprehension by adding a novel feature to online videos. We combine the low threshold for interactivity in digital experiences with the benefits of rephrased and elaborated explanations typical of face-to-face interactions, thereby helping to close knowledge gaps at scale. To demonstrate the technical feasibility of our approach, we conducted a proof-of-concept experiment and implemented a prototype that is available for testing online. Using this use case, we also show how caching can be applied in LLM-powered applications to reduce their carbon footprint.
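To make the idea concrete, here is a minimal sketch of how a cached 'Huh?' request might work. It is not the authors' implementation (their prototype is described as a JavaScript plugin). The names `Segment`, `HuhButton`, `PROMPTS`, and the `llm` callable are hypothetical, and caching per transcript segment is an assumption chosen so that viewers who press the button during the same segment share one LLM call.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Segment:
    start: float  # time in seconds at which this transcript segment begins
    text: str


# Hypothetical prompt templates: level 1 rephrases, level 2 simplifies further.
PROMPTS = {
    1: "Rephrase and elaborate on this part of a lecture:\n{passage}",
    2: "Explain this part of a lecture in much simpler terms:\n{passage}",
}


@dataclass
class HuhButton:
    transcript: List[Segment]            # sorted by start time
    llm: Callable[[str], str]            # assumed interface: prompt -> completion
    cache: Dict[Tuple[int, int], str] = field(default_factory=dict)

    def _segment_index(self, t: float) -> int:
        """Index of the last segment that started at or before time t."""
        starts = [s.start for s in self.transcript]
        return max(bisect_right(starts, t) - 1, 0)

    def explain(self, t: float, level: int = 1) -> str:
        """Return an explanation of what was just said at playback time t."""
        if level not in PROMPTS:
            raise ValueError(f"unsupported explanation level: {level}")
        idx = self._segment_index(t)
        key = (idx, level)
        if key not in self.cache:
            # Use the current segment plus the one before it as context.
            recent = self.transcript[max(idx - 1, 0): idx + 1]
            passage = " ".join(s.text for s in recent)
            self.cache[key] = self.llm(PROMPTS[level].format(passage=passage))
        return self.cache[key]
```

Because the cache key depends only on the segment and the explanation level, the same table could also be filled ahead of time for every segment, which is one way to read the "pre-produced explanations" mentioned in the TL;DR.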
