Uncovering Student Engagement Patterns in Moodle with Interpretable Machine Learning
Laura J. Johnston, Jim E. Griffin, Ioanna Manolopoulou, Takoua Jendoubi
TL;DR
The paper addresses the challenge of quantifying student engagement from virtual learning environment (VLE) log data by proposing a weekly, chapter-structured engagement metric that combines immediacy, frequency, and diversity. It evaluates nine regression models under nested cross-validation, emphasising interpretability through Generalised Additive Models (GAMs) and predictive strength through Random Forests. In a case study of a UCL computing module, the authors identify the early weeks and pre-assessment periods as critical for engagement, while the impact of delivery method remains inconclusive due to confounding factors. The work contributes to learning analytics by refining engagement measurement, identifying actionable weekly predictors, and supporting data-driven teaching strategies for proactive student support.
Abstract
Understanding and enhancing student engagement through digital platforms is critical in higher education. This study introduces a methodology for quantifying engagement across an entire module using virtual learning environment (VLE) activity log data. Using study session frequency, immediacy, and diversity, we create a cumulative engagement metric and model it against weekly VLE interactions with resources to identify critical periods and resources predictive of student engagement. In a case study of a computing module at University College London's Department of Statistical Science, we further examine how delivery methods (online, hybrid, in-person) impact student behaviour. Across nine regression models, we validate the consistency of the random forest model and highlight the interpretive strengths of generalised additive models for analysing engagement patterns. Results show weekly VLE clicks as reliable engagement predictors, with early weeks and the first assessment period being key. However, the impact of delivery methods on engagement is inconclusive due to inconsistencies across models. These findings support early intervention strategies to assist students at risk of disengagement. This work contributes to learning analytics research by proposing a refined VLE-based engagement metric and advancing data-driven teaching strategies in higher education.
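The TL;DR above notes that the regression models are compared using nested cross-validation. The following is a minimal sketch of that general technique, not the authors' pipeline: an inner loop selects a hyperparameter using only the outer training data, and the outer loop scores that choice on data the selection never saw. Everything here is an assumption made for illustration: the data are synthetic (a single "weekly clicks" feature and a noisy engagement score), a simple k-nearest-neighbour regressor stands in for the paper's models, and the function names (k_folds, knn_predict, nested_cv) are hypothetical.

```python
import random
from statistics import mean


def k_folds(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def knn_predict(train, x, k):
    """Predict y at x as the mean y of the k nearest training points (1-D feature)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return mean(y for _, y in nearest)


def mse(train, test, k):
    """Mean squared error of the k-NN regressor fitted on train, scored on test."""
    return mean((knn_predict(train, x, k) - y) ** 2 for x, y in test)


def nested_cv(data, grid, outer_k=5, inner_k=3):
    """Return one (chosen hyperparameter, outer test MSE) pair per outer fold."""
    results = []
    for test_idx in k_folds(len(data), outer_k):
        held_out = set(test_idx)
        outer_train = [d for i, d in enumerate(data) if i not in held_out]
        outer_test = [data[i] for i in test_idx]

        def inner_score(k):
            # Inner loop: cross-validate candidate k on the outer training data only.
            scores = []
            for val_idx in k_folds(len(outer_train), inner_k):
                val = set(val_idx)
                tr = [d for i, d in enumerate(outer_train) if i not in val]
                va = [outer_train[i] for i in val_idx]
                scores.append(mse(tr, va, k))
            return mean(scores)

        best_k = min(grid, key=inner_score)
        # Outer loop: score the selected k on the untouched outer test fold.
        results.append((best_k, mse(outer_train, outer_test, best_k)))
    return results


# Synthetic, illustrative data: (weekly clicks, engagement score) pairs.
rng = random.Random(0)
data = [(x, 0.05 * x + rng.gauss(0, 0.5))
        for x in (rng.uniform(0, 100) for _ in range(60))]

grid = [1, 3, 5, 9]
results = nested_cv(data, grid)
for fold, (best_k, err) in enumerate(results, start=1):
    print(f"outer fold {fold}: chosen k={best_k}, test MSE={err:.3f}")
```

Because the hyperparameter is chosen separately inside each outer fold, the outer errors estimate the performance of the whole fit-and-tune procedure rather than of one tuned model, which is what makes comparisons across different model families fair.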
