Fair Knowledge Tracing in Second Language Acquisition

Weitao Tang, Guanliang Chen, Shuaishuai Zu, Jiangyi Luo

TL;DR

The study investigates algorithmic fairness in knowledge tracing for second-language acquisition (SLA) by comparing two predictive paradigms, gradient boosted decision trees (GBDT) and multi-task learning, on the Duolingo en_es, es_en, and fr_en tracks. It uses ABROCA, a threshold-free fairness metric, to evaluate disparities across client platforms (iOS, Android, Web) and country development status (developed vs. developing). Multi-task learning generally achieves a higher F1 score with comparable or better fairness than GBDT, though both predictive performance and fairness vary by track, and fr_en favors GBDT on several metrics. The findings indicate that deep learning can improve both predictive performance and fairness in SLA, but platform and country biases remain, which calls for context-specific algorithm selection and fairness mitigation to ensure equitable learning experiences.

Abstract

In second-language acquisition, predictive modeling aids educators in implementing diverse teaching strategies, attracting significant research attention. However, while model accuracy is widely explored, model fairness remains under-examined. Model fairness ensures equitable treatment of groups, preventing unintentional biases based on attributes such as gender, ethnicity, or economic background. A fair model should produce impartial outcomes that do not systematically disadvantage any group. This study evaluates the fairness of two predictive models using the Duolingo dataset's en_es (English learners speaking Spanish), es_en (Spanish learners speaking English), and fr_en (French learners speaking English) tracks. We analyze: (1) algorithmic fairness across platforms (iOS, Android, Web); and (2) algorithmic fairness between developed and developing countries. Key findings include: (1) deep learning outperforms machine learning in second-language knowledge tracing due to improved accuracy and fairness; (2) both models favor mobile users over non-mobile users; (3) machine learning exhibits stronger bias against developing countries compared to deep learning; and (4) deep learning strikes a better balance of fairness and accuracy in the en_es and es_en tracks, while machine learning is more suitable for fr_en. This study highlights the importance of addressing fairness in predictive models to ensure equitable educational strategies across platforms and regions.
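The TL;DR names ABROCA as the fairness metric. As commonly defined, ABROCA is the absolute area between the ROC curves of two groups, integrated over all false-positive rates, so it does not depend on a single decision threshold. The following is a minimal sketch of that general definition, not the authors' implementation. The function names `roc_points` and `abroca`, the `grid_size` parameter, and the toy data are all illustrative assumptions, and the sketch assumes each group contains both positive and negative labels.

```python
import numpy as np


def roc_points(y_true, scores):
    """Empirical ROC curve as (fpr, tpr) arrays, starting at (0, 0)."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="mergesort")  # highest score first
    y_sorted, s_sorted = y_true[order], scores[order]
    # One ROC point per distinct score (last index of each run of ties).
    distinct = np.r_[np.nonzero(np.diff(s_sorted))[0], len(s_sorted) - 1]
    tps = np.cumsum(y_sorted)[distinct]   # true positives at each threshold
    fps = (distinct + 1) - tps            # false positives at each threshold
    tpr = np.r_[0.0, tps / y_sorted.sum()]
    fpr = np.r_[0.0, fps / (~y_sorted).sum()]
    return fpr, tpr


def abroca(y_true, scores, group, baseline, comparison, grid_size=10001):
    """Absolute area between the ROC curves of two groups (hypothetical helper)."""
    y_true, scores, group = map(np.asarray, (y_true, scores, group))
    grid = np.linspace(0.0, 1.0, grid_size)  # shared false-positive-rate axis
    curves = []
    for g in (baseline, comparison):
        mask = group == g
        fpr, tpr = roc_points(y_true[mask], scores[mask])
        curves.append(np.interp(grid, fpr, tpr))
    gap = np.abs(curves[0] - curves[1])
    # Trapezoidal rule written out by hand to avoid version-specific NumPy names.
    return float(np.sum((gap[1:] + gap[:-1]) / 2.0 * np.diff(grid)))


# Toy illustration with made-up labels, scores, and platform tags.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
preds = [0.1, 0.4, 0.35, 0.8, 0.2, 0.3, 0.6, 0.9]
platform = ["ios"] * 4 + ["web"] * 4
print(abroca(labels, preds, platform, "ios", "web"))
```

A value of 0 means the two ROC curves coincide; larger values indicate that the model separates correct from incorrect responses better for one group than the other at some operating points, even if the two AUCs are similar.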

Paper Structure

This paper contains 16 sections, 29 figures, 6 tables.

Figures (29)

  • Figure 1: Intelligent tutoring system [Abdelrahman2023].
  • Figure 2: Fairness between iOS and Android on the en_es track for GBDT.
  • Figure 3: Fairness between iOS and Android on the en_es track for multi-task learning.
  • Figure 4: Fairness between developed and developing countries on the en_es track for GBDT.
  • Figure 5: Fairness between developed and developing countries on the en_es track for multi-task learning.
  • ...and 24 more figures