Heterogeneity-aware Cross-school Electives Recommendation: a Hybrid Federated Approach
Chengyi Ju, Jiannong Cao, Yu Yang, Zhen-Qun Yang, Ho Man Lee
TL;DR
HFRec tackles privacy-conscious cross-school elective recommendation by addressing data sparsity and heterogeneity with per-school heterogeneous graphs and a heterogeneity-aware attention mechanism. It combines content features processed via RoBERTa with context from local graphs, and is trained under a federated scheme with a Constrained Matrix Factorization objective and adaptive learning rates. The authors report that HFRec outperforms privacy-preserving baselines on top-N metrics on both open MOOC benchmarks and real-world WebSAMS data. The work advances privacy-preserving personalized education by enabling cross-institution collaboration without direct data sharing, with potential for wider deployment and future privacy enhancements.
Abstract
Addressing cross-school learner diversity is crucial in modern education, especially in personalized recommender systems for elective course selection. However, privacy concerns often limit cross-school data sharing, which hinders existing methods' ability to model sparse data and address heterogeneity, ultimately leading to suboptimal recommendations. In response, we propose HFRec, a heterogeneity-aware hybrid federated recommender system for cross-school elective course recommendation. The model constructs a heterogeneous graph for each school, incorporating various interactions among students and their historical behaviors to integrate context and content information. We design an attention mechanism to capture heterogeneity-aware representations. Moreover, under a federated scheme, we train individual school-based models with adaptive learning settings to recommend tailored electives. HFRec provides personalized elective recommendations while maintaining privacy, and it outperforms state-of-the-art models on both open-source and real-world datasets.
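
As a rough illustration of the federated scheme described above, the following Python sketch shows one training round in which each school updates course embeddings on its own data and only those embeddings are averaged centrally. Everything in it is an assumption made for illustration: the function names `local_update` and `aggregate`, the plain squared-error matrix factorization loss, the FedAvg-style weighted mean, and the per-school learning rate standing in for the adaptive learning settings. It omits the heterogeneous graphs, the attention mechanism, and the content features, and it is not the authors' implementation.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Interaction = Tuple[str, str, float]  # (student_id, course_id, observed preference)


def local_update(
    course_vecs: Dict[str, Vector],
    student_vecs: Dict[str, Vector],
    interactions: List[Interaction],
    lr: float,
) -> Dict[str, Vector]:
    """One SGD pass inside a single school (hypothetical helper).

    Student vectors are updated in place and never leave the school.
    A fresh copy of the course vectors is returned for sharing.
    """
    # Copy so the shared (global) course embeddings are not mutated in place.
    updated = {c: v[:] for c, v in course_vecs.items()}
    for student, course, rating in interactions:
        p, q = student_vecs[student], updated[course]
        err = rating - sum(a * b for a, b in zip(p, q))
        for k in range(len(q)):
            p_k, q_k = p[k], q[k]
            p[k] += lr * err * q_k  # student factors stay local
            q[k] += lr * err * p_k  # course factors are what gets shared
    return updated


def aggregate(
    updates: List[Dict[str, Vector]], weights: List[float]
) -> Dict[str, Vector]:
    """Weighted mean of course vectors (assumed FedAvg-style aggregation).

    A course is averaged only over the schools that actually offer it.
    """
    merged: Dict[str, Vector] = {}
    for course in {c for u in updates for c in u}:
        holders = [(u[course], w) for u, w in zip(updates, weights) if course in u]
        total = sum(w for _, w in holders)
        dim = len(holders[0][0])
        merged[course] = [
            sum(w * v[k] for v, w in holders) / total for k in range(dim)
        ]
    return merged
```

In this sketch a round would call `local_update` once per school with that school's own `lr`, then pass the returned dictionaries to `aggregate`, weighted for example by each school's number of interactions. Only course vectors cross school boundaries, which is the sense in which raw student records are not shared.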
