Concept-Aware Latent and Explicit Knowledge Integration for Enhanced Cognitive Diagnosis
Yawen Chen, Jiande Sun, Jing Li, Huaxiang Zhang
TL;DR
This paper addresses two limitations of cognitive diagnosis, unidimensional mastery representations and sparse Q-matrices, by proposing CLEKI-CD, which combines concept-aware multidimensional embeddings for students and exercises with a latent Q-matrix learned through attention over a concept dependency graph. A combined diagnostic layer fuses explicit knowledge from the Q-matrix with latent knowledge to robustly infer mastery and predict responses; the prediction is $y_{ij} = \epsilon u_{ij} + (1-\epsilon) \tilde{u}_{ij}$, and the model is optimized with a cross-entropy loss $\mathcal{L}$. Key innovations include the concept-aware embedding module, an attention-based knowledge aggregation method (AGM) that generates a latent Q-matrix $\tilde{\mathbf{Q}}$, and an integration mechanism that improves performance and interpretability on real-world ITS datasets. Experiments on ASSIST and Junyi demonstrate superior predictive accuracy and robustness to sparse data, with ablations and case studies highlighting the contribution of each component and the model’s interpretability through latent concepts.
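The fusion rule and loss named above can be illustrated with a minimal sketch. It rests on several assumptions that the summary does not state: $u_{ij}$ is read as the response probability from the explicit Q-matrix branch, $\tilde{u}_{ij}$ as the probability from the latent Q-matrix branch, $\epsilon$ as a scalar weight in $[0, 1]$, and $\mathcal{L}$ as a mean-reduced binary cross-entropy over observed responses. The function names `fuse_predictions` and `cross_entropy` are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def fuse_predictions(u_explicit: Sequence[float],
                     u_latent: Sequence[float],
                     epsilon: float) -> List[float]:
    """Combine explicit and latent predictions: y = eps * u + (1 - eps) * u_tilde.

    Assumption: epsilon is a single scalar weight in [0, 1].
    """
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    if len(u_explicit) != len(u_latent):
        raise ValueError("both prediction lists must have the same length")
    return [epsilon * u + (1.0 - epsilon) * u_t
            for u, u_t in zip(u_explicit, u_latent)]


def cross_entropy(y_pred: Sequence[float],
                  responses: Sequence[int],
                  tol: float = 1e-12) -> float:
    """Mean binary cross-entropy between predictions and 0/1 responses.

    Assumption: mean reduction; the paper may use a sum instead.
    """
    if len(y_pred) != len(responses) or len(y_pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, r in zip(y_pred, responses):
        y = min(max(y, tol), 1.0 - tol)  # clamp to avoid log(0)
        total -= r * math.log(y) + (1 - r) * math.log(1.0 - y)
    return total / len(y_pred)


# Two student-exercise pairs, weighted equally between the two branches.
fused = fuse_predictions([0.8, 0.2], [0.6, 0.4], epsilon=0.5)
loss = cross_entropy(fused, [1, 0])
```

Under this reading, setting `epsilon` to 1 recovers a model that relies only on the explicit Q-matrix, and setting it to 0 relies only on the latent one.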
Abstract
Cognitive diagnosis infers students' mastery of specific knowledge concepts from their historical response logs. However, existing cognitive diagnostic models (CDMs) represent student proficiency from a unidimensional perspective, which cannot comprehensively assess mastery of each knowledge concept. Moreover, the Q-matrix binarizes the relationship between exercises and knowledge concepts and therefore cannot represent latent relationships between them. In particular, as the granularity of knowledge attributes becomes finer, the Q-matrix becomes increasingly incomplete, and its sparse binary (0/1) representation fails to capture the intricate relationships among knowledge concepts. To address these issues, we propose a Concept-aware Latent and Explicit Knowledge Integration model for cognitive diagnosis (CLEKI-CD). Specifically, for each knowledge concept, a multidimensional vector is constructed to describe student mastery and exercise difficulty from multiple perspectives, which enhances the representation capability of the model. In addition, a latent Q-matrix is generated by our proposed attention-based knowledge aggregation method; it uncovers the degree to which exercises cover latent knowledge. The latent Q-matrix supplements the sparse explicit Q-matrix with the inherent relationships among knowledge concepts and mitigates the knowledge coverage problem. Furthermore, we employ a combined cognitive diagnosis layer to integrate latent and explicit knowledge, further improving diagnostic performance. Extensive experiments on real-world datasets demonstrate that CLEKI-CD outperforms state-of-the-art models. CLEKI-CD is promising for practical applications in intelligent education, as its diagnostic results show good interpretability.
