AI-based identification and support of at-risk students: A case study of the Moroccan education system
Ismail Elbouknify, Ismail Berrada, Loubna Mekouar, Youssef Iraqi, El Houcine Bergou, Hind Belhabib, Younes Nail, Souhail Wardi
TL;DR
This paper addresses student dropout with an AI-driven predictive framework built on large-scale data from the Moroccan Ministry of National Education. The approach combines data preprocessing, model-based dropout prediction with imbalanced-data techniques, a prediction corrector, and an intervention phase guided by explainable AI (SHAP), which together enable multi-year forecasting and targeted support. Evaluation on a real Moroccan dataset (1.4 million profiles, 37 features) reports 88% accuracy, 88% recall, 86% precision, and an AUC of 87%, with LightGBM and XGBoost often performing best; the analysis also shows how the prediction horizon and the amount of historical data influence results. Because the framework is general and interpretable, and is assessed with a rigorous evaluation strategy, it can inform both policy-making and school-level interventions in other education systems.
Abstract
Student dropout is a global issue influenced by personal, familial, and academic factors, with rates that vary across countries. This paper introduces an AI-driven predictive modeling approach that uses advanced machine learning techniques to identify students at risk of dropping out. The goal is to enable timely interventions and improve educational outcomes. Our methodology is adaptable across different educational systems and levels. We assess model performance with a rigorous evaluation framework and use SHapley Additive exPlanations (SHAP) to identify the key factors influencing predictions. The approach was tested on real data provided by the Moroccan Ministry of National Education, achieving 88% accuracy, 88% recall, 86% precision, and an AUC of 87%. These results highlight the effectiveness of the AI models in identifying at-risk students. The framework incorporates historical data for both short- and long-term detection, offering a comprehensive solution to the persistent challenge of student dropout.
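For readers less familiar with the four reported metrics, the following minimal sketch shows how accuracy, precision, recall, and AUC can be computed for a binary dropout classifier. It is an illustration only, not the authors' evaluation code: the labels, scores, the 0.5 decision threshold, and the function names are all hypothetical, and dropout is assumed to be the positive class (label 1).

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall, treating label 1 (dropout) as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Of the students flagged as at risk, how many actually dropped out.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of the students who dropped out, how many were flagged.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win. This pairwise form is quadratic in the number
    of samples, so it is only suitable for small illustrative inputs.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = dropped out, 0 = stayed enrolled.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]  # model's predicted dropout risk
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

In a dropout setting the positive class is typically rare, so accuracy alone can look high even when few at-risk students are found; reporting recall and precision alongside it, as the abstract does, gives a fuller picture.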
