Harnessing LLM for Noise-Robust Cognitive Diagnosis in Web-Based Intelligent Education Systems
Guixian Zhang, Guan Yuan, Ziqi Xu, Yanmei Zhang, Jing Ren, Zhenyun Deng, Debo Cheng
TL;DR
The paper tackles noise and data-imbalance challenges in cognitive diagnosis for Web-based Intelligent Education Systems by proposing DLLM, a diffusion-based LLM framework. DLLM integrates three components: (i) Relation Augmentation Alignment to address data imbalance via augmented subgraphs and contrastive learning, (ii) Semantic Augmentation Alignment to inject LLM-derived semantic knowledge through exercise descriptions and student profiles, and (iii) a two-stage diffusion module (unconditional and graph-conditioned) to denoise representations before alignment. The approach yields noise-robust, semantically informed student and exercise embeddings that feed into existing CDMs, achieving state-of-the-art results on three public web-education datasets under varying noise levels. Practically, DLLM enhances reliability and interpretability of cognitive diagnosis in large-scale, open educational environments where logs are noisy and incomplete.
Abstract
Cognitive diagnosis in Web-based Intelligent Education Systems (WIES) aims to assess students' mastery of knowledge concepts from heterogeneous, noisy interactions. Recent work has tried to utilize Large Language Models (LLMs) for cognitive diagnosis, yet LLMs struggle with structured data and are prone to noise-induced misjudgments. In particular, the open environment of WIES continuously attracts new students and produces vast amounts of response logs, exacerbating the data imbalance and noise issues inherent in traditional educational systems. To address these challenges, we propose DLLM, a Diffusion-based LLM framework for noise-robust cognitive diagnosis. DLLM first constructs independent subgraphs based on response correctness, then applies a relation augmentation alignment module to mitigate data imbalance. The two subgraph representations are then fused and aligned with LLM-derived, semantically augmented representations. Importantly, before each alignment step, DLLM employs a two-stage denoising diffusion module to eliminate intrinsic noise while assisting structural representation alignment. Specifically, unconditional denoising diffusion first removes erroneous information, followed by graph-guided conditional denoising diffusion to eliminate misleading information. Finally, the noise-robust representation that integrates semantic knowledge and structural information is fed into existing cognitive diagnosis models for prediction. Experimental results on three publicly available web-based educational platform datasets show that DLLM achieves the best predictive performance across varying noise levels, indicating that it is robust to noise while effectively leveraging semantic knowledge from LLMs.
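As an illustration of the first step only (constructing independent subgraphs based on response correctness), a minimal Python sketch follows. The log format `(student, exercise, is_correct)` and the function name `split_by_correctness` are assumptions made for this example, not details taken from the paper; the actual DLLM graph construction may differ.

```python
from collections import defaultdict


def split_by_correctness(logs):
    """Split response logs into two bipartite edge sets.

    `logs` is an iterable of (student, exercise, is_correct) tuples.
    This tuple layout is a hypothetical format chosen for illustration.
    Returns two dicts mapping each student to the set of exercises
    answered correctly and incorrectly, respectively.
    """
    correct_edges = defaultdict(set)
    incorrect_edges = defaultdict(set)
    for student, exercise, is_correct in logs:
        # Route each interaction to the subgraph matching its outcome.
        target = correct_edges if is_correct else incorrect_edges
        target[student].add(exercise)
    return dict(correct_edges), dict(incorrect_edges)


# Toy response logs (made-up identifiers, not from any dataset).
logs = [
    ("s1", "e1", 1),
    ("s1", "e2", 0),
    ("s2", "e1", 0),
    ("s2", "e3", 1),
]
correct, incorrect = split_by_correctness(logs)
```

Each of the two edge sets could then be encoded separately before the fusion and alignment steps described above; those later steps are not sketched here.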
