Identifying Student Profiles Within Online Judge Systems Using Explainable Artificial Intelligence

Juan Ramón Rico-Juan, Víctor M. Sánchez-Cartagena, Jose J. Valero-Mas, Antonio Javier Gallego

TL;DR

This paper addresses the lack of actionable feedback in Online Judge (OJ) systems. It applies Educational Data Mining (EDM), using Multi-Instance Learning (MIL) and classical machine learning (ML) augmented by Explainable AI (XAI), to model student behavior from submission meta-information. It introduces XOJ, a framework that maps MIL representations to ML models so that SHAP-based explanations can be computed, providing interpretable feedback to both students and instructors. In a case study from a programming course with over 2,500 submissions from about 90 students, the MIL-to-ML approach with Random Forest achieves an AUC of around 0.70, outperforming MIL-only and bag-representation methods. SHAP highlights near-deadline timing and submission counts as key predictors. The work demonstrates a practical path to personalized, explainable feedback and guidance for course design using OJ data, with potential for deployment across programming education contexts.

Abstract

Online Judge (OJ) systems are typically considered within programming-related courses as they yield fast and objective assessments of the code developed by the students. Such an evaluation generally provides a single decision based on a rubric, most commonly whether the submission successfully accomplished the assignment. Nevertheless, since in an educational context such information may be deemed insufficient, it would be beneficial for both the student and the instructor to receive additional feedback about the overall development of the task. This work aims to tackle this limitation by further exploiting the information gathered by the OJ and automatically inferring feedback for both the student and the instructor. More precisely, we consider the use of learning-based schemes -- particularly, multi-instance learning (MIL) and classical machine learning formulations -- to model student behavior. In addition, explainable artificial intelligence (XAI) is used to provide human-understandable feedback. The proposal has been evaluated on a case study comprising 2500 submissions from roughly 90 different students from a programming-related course in a computer science degree. The results obtained validate the proposal: the model is capable of significantly predicting the user outcome (either passing or failing the assignment) based solely on the behavioral pattern inferred from the submissions provided to the OJ. Moreover, the proposal is able to identify prone-to-fail student groups and profiles as well as other relevant information, which eventually serves as feedback to both the student and the instructor.
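
To make the idea of modelling behavior from submission meta-information more concrete, the following is a minimal sketch of how one student's set of submissions (a "bag" in MIL terms) might be collapsed into a single fixed-length feature vector that a classical ML model could consume. The `Submission` class, the `bag_to_features` helper, the feature names, and the 24-hour "near deadline" window are all hypothetical choices made for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Submission:
    """One OJ submission; the fields are hypothetical meta-information."""
    student_id: str
    submitted_at: datetime


def bag_to_features(bag, deadline, near_deadline=timedelta(hours=24)):
    """Collapse a bag of submissions into one fixed-length feature dict.

    The features are illustrative assumptions: a submission count and
    a few timing statistics measured relative to the deadline.
    """
    if not bag:
        raise ValueError("a bag must contain at least one submission")
    times = [s.submitted_at for s in bag]
    n_total = len(bag)
    # Count submissions made within the assumed window before the deadline.
    n_near = sum(
        1 for t in times if timedelta(0) <= deadline - t <= near_deadline
    )
    return {
        "n_submissions": n_total,
        "near_deadline_ratio": n_near / n_total,
        "hours_before_deadline_first": (deadline - min(times)).total_seconds() / 3600.0,
        "hours_before_deadline_last": (deadline - max(times)).total_seconds() / 3600.0,
    }


# Toy data: three submissions by one student, two of them on the last day.
deadline = datetime(2024, 3, 1, 23, 59)
bag = [
    Submission("s01", datetime(2024, 2, 25, 10, 0)),
    Submission("s01", datetime(2024, 3, 1, 20, 0)),
    Submission("s01", datetime(2024, 3, 1, 23, 0)),
]
features = bag_to_features(bag, deadline)
```

A vector of this kind, one per student, could then be passed to a classifier such as a Random Forest and inspected with a post-hoc explainer; the actual mapping used by the authors may differ.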

Paper Structure

This paper contains 20 sections, 1 equation, 9 figures, 2 tables.

Figures (9)

  • Figure 1: Graphical representation of the scheme proposed.
  • Figure 2: Toy example of the representation of bag of elements in MIL. The three bags on the left are labelled as positive since they contain, at least, one positive element; the two on the right are tagged as negative since they contain no positive instances.
  • Figure 3: Conceptual description of the post-hoc XAI strategy considered in which, for a given submission, the system both hypothesises on the possible success/failure of the student and provides the corresponding interpretable explanation. The dashed line denotes that the Explanation block considers the Prediction as a black box, i.e., it has no access to the internal states of the method.
  • Figure 4: Stratified cross-validation schemes for the MIL (top) and ML (bottom) approaches.
  • Figure 5: Average results for the different classification schemes in terms of the AUC metric. Relative improvement with respect to the baseline case is denoted in parentheses. For a better comprehension, results are sorted in decreasing AUC score and grouped according to learning-based family.
  • ...and 4 more figures
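
The bag-labelling rule described in the Figure 2 caption (a bag is positive when it contains at least one positive element, negative otherwise) can be sketched in a few lines. The bag names and instance labels below are made-up toy values that mirror the three-positive, two-negative layout of the caption; they are not data from the paper.

```python
def bag_label(instance_labels):
    """Standard MIL assumption: a bag is positive iff any instance is positive."""
    return any(instance_labels)


# Hypothetical toy bags: True marks a positive instance.
bags = {
    "bag_1": [True, False, False],
    "bag_2": [False, True, True],
    "bag_3": [False, False, True],
    "bag_4": [False, False],
    "bag_5": [False, False, False],
}

labels = {name: bag_label(instances) for name, instances in bags.items()}

for name, is_positive in labels.items():
    print(name, "positive" if is_positive else "negative")
```

Note that in practice only the bag-level label is observed during training; the instance-level labels are shown here purely to illustrate the rule.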