Identifying Student Profiles Within Online Judge Systems Using Explainable Artificial Intelligence
Juan Ramón Rico-Juan, Víctor M. Sánchez-Cartagena, Jose J. Valero-Mas, Antonio Javier Gallego
TL;DR
This paper addresses the lack of actionable feedback in Online Judge (OJ) systems by applying Educational Data Mining (EDM) with Multi-Instance Learning (MIL) and classical machine learning, augmented by Explainable AI (XAI) to model student behavior from submission meta-information. It introduces XOJ, a framework that maps MIL representations to ML models to enable SHAP-based explanations, providing interpretable feedback to both students and instructors. In a case study from a programming course with over 2,500 submissions from about 90 students, the MIL-to-ML approach with Random Forest achieves an AUC of around 0.70, outperforming MIL-only and bag-representation methods, and SHAP highlights near-deadline timing and submission counts as key predictors. The work demonstrates a practical path to personalized, explainable feedback and guidance for course design using OJ data, with potential for deployment across programming education contexts.
Abstract
Online Judge (OJ) systems are commonly used in programming-related courses because they yield fast and objective assessments of the code developed by the students. Such an evaluation generally provides a single decision based on a rubric, most commonly whether the submission successfully accomplished the assignment. Nevertheless, since such information may be deemed insufficient in an educational context, it would be beneficial for both the student and the instructor to receive additional feedback about the overall development of the task. This work aims to tackle this limitation by further exploiting the information gathered by the OJ and automatically inferring feedback for both the student and the instructor. More precisely, we consider the use of learning-based schemes (particularly, multi-instance learning (MIL) and classical machine learning formulations) to model student behavior. In addition, explainable artificial intelligence (XAI) is used to provide human-understandable feedback. The proposal has been evaluated on a case study comprising 2500 submissions from roughly 90 different students from a programming-related course in a computer science degree. The results obtained validate the proposal: the model is capable of significantly predicting the user outcome (either passing or failing the assignment) solely based on the behavioral pattern inferred from the submissions provided to the OJ. Moreover, the proposal is able to identify prone-to-fail student groups and profiles as well as other relevant information, which eventually serves as feedback to both the student and the instructor.
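As an illustration of the multi-instance setting described above, the following minimal sketch shows one way a student's bag of submissions could be collapsed into a fixed-length vector that a classical classifier can consume. It is not the authors' implementation: the `Submission` class, the `hours_before_deadline` field, the 24-hour threshold, and the three chosen features are all hypothetical, picked only because submission counts and near-deadline timing are the kinds of meta-information the summary mentions.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Submission:
    # Hypothetical meta-information field: hours left until the deadline
    # at the moment the submission reached the OJ.
    hours_before_deadline: float


def bag_to_vector(bag, near_deadline_hours=24.0):
    """Collapse one student's bag of submissions into a fixed-length vector.

    Returns [number of submissions,
             fraction submitted within `near_deadline_hours` of the deadline,
             mean hours before the deadline].
    The feature set and threshold are assumptions made for this sketch.
    """
    if not bag:
        return [0, 0.0, 0.0]
    n = len(bag)
    near = sum(1 for s in bag if s.hours_before_deadline <= near_deadline_hours)
    return [n, near / n, mean(s.hours_before_deadline for s in bag)]


# One student's bag: four submissions, three of them in the last 24 hours.
bag = [Submission(30.0), Submission(6.0), Submission(2.0), Submission(1.0)]
features = bag_to_vector(bag)
```

Vectors built this way, one per student and assignment, could then be passed to any standard classifier together with the pass or fail label, and a feature-attribution method could be applied to the resulting model.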
