Harnessing LLM for Noise-Robust Cognitive Diagnosis in Web-Based Intelligent Education Systems

Guixian Zhang, Guan Yuan, Ziqi Xu, Yanmei Zhang, Jing Ren, Zhenyun Deng, Debo Cheng

TL;DR

The paper tackles noise and data-imbalance challenges in cognitive diagnostics for Web-based Intelligent Education Systems by proposing DLLM, a diffusion-based LLM framework. DLLM integrates three components: (i) Relation Augmentation Alignment to address data imbalance via augmented subgraphs and contrastive learning, (ii) Semantic Augmentation Alignment to inject LLM-derived semantic knowledge through exercise descriptions and student profiles, and (iii) a two-stage diffusion module (unconditional and graph-conditioned) to denoise representations before alignment. The approach yields noise-robust, semantically informed student and exercise embeddings that feed into existing CDMs, achieving state-of-the-art results on three public web-education datasets under varying noise levels. Practically, DLLM enhances reliability and interpretability of cognitive diagnostics in large-scale, open educational environments where logs are noisy and incomplete.

Abstract

Cognitive diagnostics in the Web-based Intelligent Education System (WIES) aims to assess students' mastery of knowledge concepts from heterogeneous, noisy interactions. Recent work has tried to utilize Large Language Models (LLMs) for cognitive diagnosis, yet LLMs struggle with structured data and are prone to noise-induced misjudgments. In particular, WIES's open environment continuously attracts new students and produces vast amounts of response logs, exacerbating the data imbalance and noise issues inherent in traditional educational systems. To address these challenges, we propose DLLM, a Diffusion-based LLM framework for noise-robust cognitive diagnosis. DLLM first constructs independent subgraphs based on response correctness, then applies a relation augmentation alignment module to mitigate data imbalance. The two subgraph representations are then fused and aligned with LLM-derived, semantically augmented representations. Importantly, before each alignment step, DLLM employs a two-stage denoising diffusion module to eliminate intrinsic noise while assisting structural representation alignment. Specifically, unconditional denoising diffusion first removes erroneous information, and graph-guided conditional denoising diffusion then eliminates misleading information. Finally, the noise-robust representation that integrates semantic knowledge and structural information is fed into existing cognitive diagnosis models for prediction. Experimental results on three publicly available web-based educational platform datasets show that DLLM achieves the best predictive performance across varying noise levels, demonstrating that it attains noise robustness while effectively leveraging semantic knowledge from LLMs.
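The first step described in the abstract, building separate subgraphs by response correctness, can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration only: the `(student_id, exercise_id, correct)` log schema, the function names `split_by_correctness` and `flip_labels`, and the label-flipping noise model are hypothetical and are not taken from the paper, which may construct its subgraphs and noise differently.

```python
import random
from typing import List, Tuple

# Hypothetical log schema: (student_id, exercise_id, correct), correct in {0, 1}.
Log = Tuple[int, int, int]
Edge = Tuple[int, int]


def split_by_correctness(logs: List[Log]) -> Tuple[List[Edge], List[Edge]]:
    """Split the student-exercise bipartite graph into two edge lists:
    one for correct responses and one for incorrect responses."""
    correct = [(s, e) for s, e, r in logs if r == 1]
    incorrect = [(s, e) for s, e, r in logs if r == 0]
    return correct, incorrect


def flip_labels(logs: List[Log], ratio: float, seed: int = 0) -> List[Log]:
    """Flip the correctness label of a fixed fraction of logs.
    This is an assumed noise model, used only to illustrate noisy logs."""
    rng = random.Random(seed)
    n_flip = round(ratio * len(logs))
    flip_idx = set(rng.sample(range(len(logs)), n_flip))
    return [
        (s, e, 1 - r) if i in flip_idx else (s, e, r)
        for i, (s, e, r) in enumerate(logs)
    ]


logs = [(0, 10, 1), (0, 11, 0), (1, 10, 1), (1, 12, 1), (2, 11, 0), (2, 12, 1)]
correct_edges, incorrect_edges = split_by_correctness(logs)
noisy_logs = flip_labels(logs, ratio=0.5)
```

Splitting again after `flip_labels` moves the flipped interactions from one subgraph to the other, which is one simple way to see how log noise can distort the structural signal that later stages rely on.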

Paper Structure

This paper contains 38 sections, 26 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Student profile #3 on the Assist0910 dataset, generated by an LLM from the original response log and from the response log with 15% added noise. The noise turns strengths into weaknesses.
  • Figure 2: The proposed DLLM framework. In the figure, CDMs refers to any existing cognitive diagnostic models, RAA refers to the Relation Augmentation Alignment module, and $\text{DDM}_u$ and $\text{DDM}_c$ refer to the unconditional and conditional Denoising Diffusion Models, respectively.
  • Figure 3: Noise tests on three datasets under different noise conditions.
  • Figure 4: Ablation experiments on three datasets under different noise conditions.
  • Figure 5: Hyperparameter analysis on two datasets.
  • ...and 1 more figure