Symbolic Cognitive Diagnosis via Hybrid Optimization for Intelligent Education Systems
Junhao Shen, Hong Qian, Wei Zhang, Aimin Zhou
TL;DR
The paper tackles cognitive diagnosis with the aim of balancing interpretability and generalization. It proposes Symbolic Cognitive Diagnosis (SCD), which models the student–exercise interaction with a symbolic regression tree and trains parameters with gradient methods. SCD adopts a hybrid alternating optimization: derivative-free genetic programming evolves the symbolic interaction, and Adam-based gradient updates fit the proficiency and exercise features, all under monotonicity constraints. Empirical results on four real-world datasets show that SCD achieves competitive generalization while delivering superior interpretability via explicit symbolic trees and the DOA metric, and an ablation study confirms the importance of both components. This approach offers a transparent, adaptable framework for cognitive diagnosis in intelligent education systems, with potential for broader educational applications.
Abstract
Cognitive diagnosis assessment is a fundamental task for student learning. It models the student-exercise interaction and discovers the students' proficiency levels on each knowledge attribute. In real-world intelligent education systems, generalization and interpretability of cognitive diagnosis methods are equally important. However, most existing methods struggle to achieve both because the student-exercise interaction is complicated. To this end, this paper proposes a symbolic cognitive diagnosis (SCD) framework to enhance generalization and interpretability simultaneously. The SCD framework uses a symbolic tree to represent the complicated student-exercise interaction function explicitly, and uses gradient-based optimization methods to learn the student and exercise parameters effectively. The accompanying challenge is bridging the discrete symbolic representation and the continuous parameter optimization. To address this challenge, we propose a hybrid optimization that alternates between the representation and the parameters: SCD learns the symbolic tree by derivative-free genetic programming and learns the student and exercise parameters via gradient-based Adam. Extensive experimental results on various real-world datasets show the superiority of SCD in both generalization and interpretability. The ablation study verifies the efficacy of each ingredient in SCD, and the case study shows explicitly how the interpretability of SCD works.
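
The alternating scheme described in the abstract can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the authors' implementation. Each student and each exercise gets a single scalar parameter (`theta`, `beta`). The operator set is limited to `+`, `-`, `*`. A mutation-only hill climb stands in for full genetic programming. Adam is written out by hand and restarted every round. The monotonicity constraints are omitted. All function names and the toy response log are hypothetical.

```python
import math
import random

OPS = ("+", "-", "*")
LEAVES = ("theta", "beta")  # assumed scalar student / exercise parameters

def evaluate(tree, theta, beta):
    """Return (value, d/dtheta, d/dbeta) of an expression tree (forward-mode)."""
    if tree == "theta":
        return theta, 1.0, 0.0
    if tree == "beta":
        return beta, 0.0, 1.0
    op, left, right = tree
    lv, lt, lb = evaluate(left, theta, beta)
    rv, rt, rb = evaluate(right, theta, beta)
    if op == "+":
        return lv + rv, lt + rt, lb + rb
    if op == "-":
        return lv - rv, lt - rt, lb - rb
    return lv * rv, lt * rv + lv * rt, lb * rv + lv * rb  # product rule

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp only to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def loss(tree, params, n_students, logs):
    """Mean binary cross-entropy over (student, exercise, correct) records."""
    total = 0.0
    for s, e, y in logs:
        p = sigmoid(evaluate(tree, params[s], params[n_students + e])[0])
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(logs)

def gradients(tree, params, n_students, logs):
    grad = [0.0] * len(params)
    for s, e, y in logs:
        z, dt, db = evaluate(tree, params[s], params[n_students + e])
        err = (sigmoid(z) - y) / len(logs)  # d(loss)/dz for the logistic loss
        grad[s] += err * dt
        grad[n_students + e] += err * db
    return grad

def adam_fit(tree, params, n_students, logs, steps=50, lr=0.05,
             b1=0.9, b2=0.999, eps=1e-8):
    """Continuous step: Adam on the parameters with the tree held fixed."""
    params = list(params)
    best, best_loss = list(params), loss(tree, params, n_students, logs)
    m, v = [0.0] * len(params), [0.0] * len(params)
    for t in range(1, steps + 1):
        for i, g in enumerate(gradients(tree, params, n_students, logs)):
            m[i] = b1 * m[i] + (1 - b1) * g
            v[i] = b2 * v[i] + (1 - b2) * g * g
            m_hat, v_hat = m[i] / (1 - b1 ** t), v[i] / (1 - b2 ** t)
            params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
        current = loss(tree, params, n_students, logs)
        if current < best_loss:  # keep the best iterate seen so far
            best, best_loss = list(params), current
    return best

def random_tree(rng, depth):
    if depth <= 0 or rng.random() < 0.3:
        return rng.choice(LEAVES)
    return (rng.choice(OPS), random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def mutate(tree, rng, depth=2):
    """Replace one randomly chosen subtree with a fresh random subtree."""
    if isinstance(tree, str) or rng.random() < 0.3:
        return random_tree(rng, depth)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, rng, depth - 1), right)
    return (op, left, mutate(right, rng, depth - 1))

def evolve(tree, params, n_students, logs, rng, n_candidates=20):
    """Discrete step: derivative-free search over trees, parameters held fixed."""
    best, best_loss = tree, loss(tree, params, n_students, logs)
    for _ in range(n_candidates):
        cand = mutate(best, rng)
        cand_loss = loss(cand, params, n_students, logs)
        if cand_loss < best_loss:
            best, best_loss = cand, cand_loss
    return best

def to_string(tree):
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    return f"({to_string(left)} {op} {to_string(right)})"

def fit_scd_sketch(logs, n_students, n_exercises, rounds=5, seed=0):
    rng = random.Random(seed)
    tree = ("-", "theta", "beta")  # assumed starting interaction function
    params = [rng.uniform(-0.1, 0.1) for _ in range(n_students + n_exercises)]
    history = [loss(tree, params, n_students, logs)]
    for _ in range(rounds):
        tree = evolve(tree, params, n_students, logs, rng)  # symbolic tree
        params = adam_fit(tree, params, n_students, logs)   # parameters
        history.append(loss(tree, params, n_students, logs))
    return tree, params, history

# Hypothetical response log: (student, exercise, answered correctly)
logs = [(0, 0, 1), (0, 1, 1), (0, 2, 1), (0, 3, 1),
        (1, 0, 1), (1, 1, 1), (1, 2, 0), (1, 3, 0),
        (2, 0, 1), (2, 1, 0), (2, 2, 0), (2, 3, 0)]
tree, params, history = fit_scd_sketch(logs, n_students=3, n_exercises=4)
print(to_string(tree), [round(h, 4) for h in history])
```

Because each half-step only accepts a tree or parameter vector that does not increase the loss, the recorded training loss in this sketch never goes up between rounds. The learned interaction function stays readable as a plain expression via `to_string`, which is the kind of transparency the symbolic tree is meant to provide.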
