On the Identifiability of Diagnostic Classification Models
Guanhua Fang, Jingchen Liu, Zhiliang Ying
TL;DR
The paper develops a general identifiability theory for diagnostic classification models (DCMs) within a latent class framework, covering the identifiability of item parameters, the attribute distribution, and the item-specific partial information structure encoded by the Q-matrix. It uses Kruskal-type arguments and T-matrix rank conditions to derive sufficient conditions for identifiability, and it introduces a Bayesian estimation framework with a Dirichlet process–style stick-breaking prior so that the number of latent classes need not be fixed in advance; the partial information structure is inferred by clustering item-response probabilities. In simulations under NIDA, NC-RUM, and LCDM settings and in an analysis of real NESARC Social Phobia data, the authors report consistent estimation and successful reconstruction of the qualitative item-attribute relationships. The work provides a unified foundation for identifiability in DCMs that does not depend on a specific parameterization, and it offers practical estimation tools for uncovering the underlying attribute structure without pre-specifying the number of latent classes.
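To make the stick-breaking idea concrete, the following is a minimal sketch of the standard truncated stick-breaking construction for class weights. It is not taken from the paper: the function name, the concentration parameter `alpha`, and the truncation level `num_sticks` are illustrative assumptions, and the authors' exact prior may differ.

```python
import numpy as np


def stick_breaking_weights(alpha, num_sticks, rng):
    """Draw class weights from a truncated stick-breaking construction.

    Assumed (textbook) form: v_k ~ Beta(1, alpha) and
    w_k = v_k * prod_{l<k} (1 - v_l). The last stick is set to 1 so that
    the truncated weights sum to exactly one.
    """
    v = rng.beta(1.0, alpha, size=num_sticks)
    v[-1] = 1.0  # truncation: the final class absorbs the remaining mass
    # Mass still unbroken before each stick: 1, (1-v_1), (1-v_1)(1-v_2), ...
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


rng = np.random.default_rng(0)
weights = stick_breaking_weights(alpha=1.0, num_sticks=20, rng=rng)
```

With a small `alpha`, most of the mass tends to fall on the first few classes, which is what lets a sampler work with a generous upper bound on the number of classes while the data determine how many are effectively used.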
Abstract
This paper establishes fundamental results for statistical inference of diagnostic classification models (DCMs). The results are developed at a high level of generality and are applicable to essentially all diagnostic classification models. In particular, we establish identifiability results for various modeling parameters, notably the item response probabilities, the attribute distribution, and the Q-matrix-induced partial information structure. Consistent estimators are constructed. Simulation results show that these estimators perform well under various modeling settings. We also use a real-data example to illustrate the new method. The results are stated under the setting of general latent class models. For a DCM with a specific parameterization, the conditions may be adapted accordingly.
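As a rough illustration of the general latent class setting the results are stated in, the sketch below computes the marginal probability of a binary response pattern, assuming responses are conditionally independent given the latent class. The function name, the array layout, and the numerical values are hypothetical and chosen only for illustration; they are not from the paper.

```python
import itertools

import numpy as np


def response_pattern_probability(response, class_probs, item_probs):
    """Marginal probability of a binary response pattern.

    Assumed model: P(R = r) = sum_c p_c * prod_j theta[j, c]^r_j * (1 - theta[j, c])^(1 - r_j),
    where item_probs[j, c] is the probability of a positive response to
    item j in latent class c, and class_probs[c] is the class proportion.
    """
    r = np.asarray(response)[:, None]                         # shape (J, 1)
    per_item = np.where(r == 1, item_probs, 1.0 - item_probs)  # shape (J, C)
    return float(per_item.prod(axis=0) @ class_probs)


# Hypothetical toy setting: 3 items, 2 latent classes.
class_probs = np.array([0.4, 0.6])
item_probs = np.array([[0.2, 0.9],
                       [0.3, 0.8],
                       [0.1, 0.7]])

p_all_positive = response_pattern_probability([1, 1, 1], class_probs, item_probs)
total = sum(
    response_pattern_probability(r, class_probs, item_probs)
    for r in itertools.product([0, 1], repeat=3)
)
```

Identifiability asks whether `class_probs` and `item_probs` can be recovered (up to relabeling of the classes) from the distribution over response patterns that this function defines.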
