Counterfactual Fairness Evaluation of Machine Learning Models on Educational Datasets
Woojin Kim, Hyeoncheol Kim
TL;DR
This work evaluates the counterfactual fairness of machine learning models on educational data using a Structural Causal Model (SCM) framework. It adopts a Level 1 counterfactual fairness approach and compares three models: an unfair baseline, a fairness-through-unawareness baseline, and a counterfactually fair model. The comparison covers three benchmark datasets (Law School, OULAD, Student Performance), four common predictors (LR, MLP, RF, XGB), and multiple fairness metrics, including Wasserstein Distance, MMD, ABROCA, and MADD. The results show that counterfactual fairness can substantially reduce distributional disparities across sensitive groups, but may incur predictive performance trade-offs that vary by dataset and model class. The study advances causal-aware fairness in education, highlights the value of combining causal and statistical fairness notions, and points to future work on richer causal models (Level 2) and mechanisms to balance fairness with predictive accuracy in educational settings.
Abstract
As machine learning models are increasingly used in educational settings, from detecting at-risk students to predicting student performance, algorithmic bias and its potential impact on students raise critical concerns about algorithmic fairness. Although group fairness has been widely explored in education, individual fairness in a causal context, and counterfactual fairness in particular, remains understudied. This paper explores the notion of counterfactual fairness for educational data by analyzing the counterfactual fairness of machine learning models on benchmark educational datasets. We demonstrate that counterfactual fairness provides meaningful insight into the causal role of sensitive attributes and into causality-based individual fairness in education.
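As a rough illustration of what "distributional disparity across sensitive groups" can mean in practice, the sketch below computes an empirical 1-Wasserstein distance between the predicted scores of two groups. It is a minimal standard-library example under stated assumptions, not the evaluation code used in this work: the function name `wasserstein_1d` and the score lists are hypothetical, and the scores are made-up values chosen only to show the calculation.

```python
from bisect import bisect_right


def wasserstein_1d(xs, ys):
    """Empirical 1-Wasserstein distance between two 1D samples.

    Computed as the integral of |F_x(t) - F_y(t)| over the merged support,
    where F_x and F_y are the empirical CDFs of the two samples.
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    xs, ys = sorted(xs), sorted(ys)
    points = sorted(set(xs) | set(ys))
    total = 0.0
    for left, right in zip(points, points[1:]):
        # Empirical CDF values, constant on the interval [left, right).
        cdf_x = bisect_right(xs, left) / len(xs)
        cdf_y = bisect_right(ys, left) / len(ys)
        total += abs(cdf_x - cdf_y) * (right - left)
    return total


# Hypothetical predicted scores for two sensitive groups (illustrative only).
scores_group_a = [0.2, 0.4, 0.6]
scores_group_b = [0.3, 0.5, 0.7]

gap = wasserstein_1d(scores_group_a, scores_group_b)
print(f"Wasserstein distance between group score distributions: {gap:.3f}")
```

A value near zero indicates that the two groups receive nearly identical score distributions, and larger values indicate a larger gap. The same function could be applied to predictions made on factual and counterfactual inputs, assuming counterfactual samples are available from a causal model.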
