Enhancing Student Performance Prediction on Learnersourced Questions with SGNN-LLM Synergy

Lin Ni, Sijie Wang, Zeyu Zhang, Xiaoxuan Li, Xianda Zheng, Paul Denny, Jiamou Liu

TL;DR

Predicting student performance on learnersourced MCQs under noisy data and cold-start conditions is tackled with a Signed Bipartite Graph Contrastive Learning (SBCL) framework augmented by LLM-derived semantic embeddings. The method employs graph augmentation and dual GNN encoders to learn edge signs, while the LLM supplies question-level knowledge points, which are combined with the structural embeddings for robust predictions. Key contributions include formalizing sign prediction on signed bipartite graphs, introducing inter-/intra-view contrastive learning, and validating semantic augmentation across five PeerWise datasets with leading performance, including high F1 scores. This work enhances robustness and personalization in learnersourcing platforms by leveraging both the network structure and the semantic content of questions.

Abstract

Learnersourcing offers great potential for scalable education through student content creation. However, predicting student performance on learnersourced questions, which is essential for personalizing the learning experience, is challenging due to the inherent noise in student-generated data. Moreover, while conventional graph-based methods can capture the complex network of student and question interactions, they often fall short under cold start conditions, where limited student engagement with questions yields sparse data. To address both challenges, we introduce a strategy that integrates Signed Graph Neural Networks (SGNNs) with Large Language Model (LLM) embeddings. Our methodology employs a signed bipartite graph to model student answers, complemented by a contrastive learning framework that enhances noise resilience. The LLM, in turn, generates foundational question embeddings, which are especially advantageous in cold start scenarios characterized by limited graph data. Validation across five real-world datasets sourced from the PeerWise platform demonstrates the approach's effectiveness: our method outperforms baselines in both predictive accuracy and robustness.
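
To make the signed bipartite graph concrete, the sketch below builds one from a handful of answer records. It is only an illustration, not the authors' implementation: the `Answer` record, the function names, and the sign convention (+1 for a correct answer, -1 for an incorrect one) are all assumptions made for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class Answer(NamedTuple):
    """One student response to one question (hypothetical record layout)."""
    student: str
    question: str
    correct: bool


def build_signed_bipartite_graph(answers):
    """Map each (student, question) edge to a sign.

    Assumed convention: +1 if the answer was correct, -1 otherwise.
    If a student answers the same question twice, the last record wins.
    """
    edges = {}
    for a in answers:
        edges[(a.student, a.question)] = 1 if a.correct else -1
    return edges


def neighbours_by_sign(edges):
    """Split each student's questions into positive and negative neighbours."""
    positive = defaultdict(set)
    negative = defaultdict(set)
    for (student, question), sign in edges.items():
        target = positive if sign > 0 else negative
        target[student].add(question)
    return positive, negative


answers = [
    Answer("s1", "q1", True),
    Answer("s1", "q2", False),
    Answer("s2", "q1", False),
]
edges = build_signed_bipartite_graph(answers)
positive, negative = neighbours_by_sign(edges)
```

Students and questions form the two node sets, so every edge joins a student to a question and never two nodes of the same kind. Predicting performance then amounts to predicting the sign of an edge that has not been observed yet.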

Paper Structure (12 sections, 9 equations, 5 figures, 3 tables)

Figures (5)

  • Figure 1: A scenario for the signed bipartite graph
  • Figure 2: The framework of the LLM-SBCL model
  • Figure 3: The framework of SBCL
  • Figure 4: Inter-view and Intra-view contrastive loss
  • Figure 5: Flatten MCQ component key-value pairs into a sequence to form an MCQ "sentence"
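
The flattening step named in the Figure 5 caption can be sketched as follows. This is a minimal illustration under stated assumptions: the field names (`question`, `option A`, `answer`), the `key: value` layout, and the ` | ` separator are hypothetical and are not taken from the paper.

```python
def flatten_mcq(mcq: dict[str, str], separator: str = " | ") -> str:
    """Flatten MCQ key-value pairs into one text sequence.

    Pairs are emitted in insertion order as "key: value"; empty values
    are skipped. The layout and separator are assumptions for this sketch.
    """
    parts = []
    for key, value in mcq.items():
        text = value.strip()
        if text:
            parts.append(f"{key}: {text}")
    return separator.join(parts)


# Hypothetical MCQ with illustrative field names.
mcq = {
    "question": "What is 2 + 2?",
    "option A": "3",
    "option B": "4",
    "explanation": "",
    "answer": "B",
}
sentence = flatten_mcq(mcq)
```

Here `sentence` is `question: What is 2 + 2? | option A: 3 | option B: 4 | answer: B`. A single string of this kind is the sort of input a text-embedding model accepts, which is presumably why the components are serialized before being passed to the LLM.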