Intelligent Interface: Enhancing Lecture Engagement with Didactic Activity Summaries
Anna Wróblewska, Marcel Witas, Kinga Frańczak, Arkadiusz Kniaź, Siew Ann Cheong, Tan Seng Chee, Janusz Hołyst, Marcin Paprzycki
TL;DR
This work tackles the lack of lecturer-focused AI tools by developing a modular prototype that analyzes lecture videos to detect didactic features and generate visual summaries for feedback. It combines video/audio processing, automatic transcription, and transformer-based text modeling to identify teaching behaviors and present actionable insights to instructors. Key contributions include a defined didactic feature taxonomy, a labeled dataset of 128 physics lectures with 380 observations, a six-module system architecture, and an interactive interface for feedback and visualization. The study demonstrates feasibility and points to future work on larger datasets and continual learning with human oversight to scale lecturer support in educational settings.
Abstract
Machine learning has recently found many new applications, including those that arise when image analysis methods are applied to video streams in the broad sense. In this context, a novel tool has been developed for academic educators to enhance the teaching process by automatically summarizing lectures and offering prompt feedback on how they are conducted. The implemented prototype uses machine learning-based techniques to recognize selected didactic and behavioral features of teachers in lecture video recordings. Specifically, users (teachers) can upload their lecture videos, which are then preprocessed and analyzed using machine learning models. Next, users can view summaries of the recognized didactic features through interactive charts and tables. Additionally, the stored ML-based prediction results support comparisons between lectures based on their didactic content. The application uses text-based models trained on lecture transcriptions, with transcription quality improved by adopting an automatic speech recognition solution. Furthermore, the system is designed to allow future integration of additional machine learning models and software modules for image and video analysis.
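
The summarize-and-compare workflow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the label names (`question`, `example`, `explanation`), the `keyword_classifier` stand-in, and all function and class names are hypothetical, chosen only to show how per-segment predictions might be aggregated into a per-lecture summary and how stored summaries might be compared. In the actual system a trained text model would take the place of the toy classifier.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical interface: a model maps one transcript segment to a label.
SegmentClassifier = Callable[[str], str]


@dataclass
class LectureSummary:
    """Counts of predicted didactic features for a single lecture."""
    lecture_id: str
    feature_counts: Dict[str, int]

    def shares(self) -> Dict[str, float]:
        """Return each feature's fraction of all labeled segments."""
        total = sum(self.feature_counts.values())
        if total == 0:
            return {}
        return {label: n / total for label, n in self.feature_counts.items()}


def summarize_lecture(
    lecture_id: str, segments: List[str], classify: SegmentClassifier
) -> LectureSummary:
    """Label every transcript segment and aggregate the labels into counts."""
    counts = Counter(classify(segment) for segment in segments)
    return LectureSummary(lecture_id, dict(counts))


def compare_lectures(a: LectureSummary, b: LectureSummary) -> Dict[str, float]:
    """Return, per feature, the share in lecture b minus the share in lecture a."""
    shares_a, shares_b = a.shares(), b.shares()
    labels = sorted(set(shares_a) | set(shares_b))
    return {label: shares_b.get(label, 0.0) - shares_a.get(label, 0.0)
            for label in labels}


def keyword_classifier(segment: str) -> str:
    """Toy stand-in for a trained text model; the labels are invented."""
    text = segment.lower()
    if "?" in text:
        return "question"
    if "for example" in text:
        return "example"
    return "explanation"


if __name__ == "__main__":
    first = summarize_lecture(
        "lecture-a",
        ["What is force?", "Force is mass times acceleration.",
         "For example, a falling ball.", "Any questions?"],
        keyword_classifier,
    )
    second = summarize_lecture(
        "lecture-b", ["Energy is conserved.", "Why?"], keyword_classifier
    )
    print(first.feature_counts)
    print(compare_lectures(first, second))
```

Passing the classifier in as a plain callable mirrors the flexibility the abstract mentions: a different model could be swapped in without changing the aggregation or comparison code.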
