P-MIA: A Profiled-Based Membership Inference Attack on Cognitive Diagnosis Models
Mingliang Hou, Yinuo Wang, Teng Guo, Zitao Liu, Wenzhou Dou, Jiaqi Zheng, Renqiang Luo, Mi Tian, Weiqi Luo
TL;DR
This work addresses the privacy risks of cognitive diagnosis models (CDMs) by introducing P-MIA, a grey-box membership inference attack that leverages the knowledge state vector $kstate\_emb$ exposed in learner profiles. By reverse-engineering $kstate\_emb$ from radar-chart visuals or API-exposed states, P-MIA substantially strengthens member-vs-nonmember discrimination compared to black-box baselines, as shown across three real-world datasets and multiple CDMs. The paper also develops two attack architectures, presents ablations showing that $kstate\_emb$ is the dominant leakage source, and applies P-MIA to audit machine unlearning defenses, revealing that current approximate unlearning methods often fail to sufficiently mitigate leakage. Overall, the results highlight a critical trade-off between explainability and privacy in educational AI and motivate domain-specific defenses that protect sensitive student data without sacrificing transparency.
Abstract
Cognitive diagnosis models (CDMs) are pivotal for creating fine-grained learner profiles in modern intelligent education platforms. However, these models are trained on sensitive student data, raising significant privacy concerns. While membership inference attacks (MIAs) have been studied in various domains, their application to CDMs remains a critical research gap, leaving the privacy risks of these models unquantified. This paper is the first to systematically investigate MIAs against CDMs. We introduce a novel and realistic grey-box threat model that exploits the explainability features of these platforms, where a model's internal knowledge state vectors are exposed to users through visualizations such as radar charts. We demonstrate that these vectors can be accurately reverse-engineered from such visualizations, creating a potent attack surface. Based on this threat model, we propose a profile-based MIA (P-MIA) framework that uses both the model's final prediction probabilities and the exposed internal knowledge state vectors as features. Extensive experiments on three real-world datasets against mainstream CDMs show that our grey-box attack significantly outperforms standard black-box baselines. Furthermore, we showcase the utility of P-MIA as an auditing tool by using it to evaluate the efficacy of machine unlearning techniques and reveal their limitations.
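To make the reverse-engineering step concrete, the following is a minimal sketch of how a knowledge state vector might be recovered from the vertices of a radar chart and combined with prediction probabilities into an attack feature vector. It is an illustration under stated assumptions, not the paper's procedure: it assumes the chart has equally spaced axes, values normalized to [0, 1], and a known center and maximum radius, and that the vertex coordinates have already been extracted from the image. The function names `radar_vertices`, `recover_values`, and `attack_features` are hypothetical.

```python
import math


def radar_vertices(values, center=(0.0, 0.0), max_radius=1.0):
    """Forward map (assumed chart layout): value i sits on axis i of a K-axis chart."""
    k = len(values)
    cx, cy = center
    return [
        (
            cx + v * max_radius * math.cos(2 * math.pi * i / k),
            cy + v * max_radius * math.sin(2 * math.pi * i / k),
        )
        for i, v in enumerate(values)
    ]


def recover_values(vertices, center=(0.0, 0.0), max_radius=1.0):
    """Inverse map: each value is the vertex's distance from the center,
    normalized by the axis length. Assumes non-negative values."""
    cx, cy = center
    return [math.hypot(x - cx, y - cy) / max_radius for x, y in vertices]


def attack_features(pred_probs, kstate):
    """Concatenate black-box outputs with the recovered knowledge state
    (an assumed feature layout for a grey-box attack model)."""
    return list(pred_probs) + list(kstate)


# Illustrative values only; not taken from any dataset in the paper.
kstate = [0.82, 0.35, 0.60, 0.10, 0.95]
points = radar_vertices(kstate, center=(200.0, 150.0), max_radius=120.0)
recovered = recover_values(points, center=(200.0, 150.0), max_radius=120.0)
features = attack_features([0.91, 0.12], recovered)
```

Under these assumptions the recovery is exact up to floating-point error, which illustrates why a rendered radar chart can expose essentially the same information as the raw vector. In practice, pixel quantization and unknown axis scaling would add noise that this sketch does not model.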
