Goodness-of-Fit Tests for Latent Class Models with Ordinal Categorical Data

Huan Qing

Abstract

Ordinal categorical data are widely collected in psychology, education, and other social sciences, most commonly through questionnaires, assessments, and surveys. Latent class models provide a flexible framework for uncovering unobserved heterogeneity by grouping individuals into homogeneous classes based on their response patterns. A fundamental challenge in applying these models is determining the number of latent classes, which is unknown and must be inferred from data. In this paper, we propose a test statistic for this problem. The statistic centers the largest singular value of a normalized residual matrix by a simple sample-size adjustment. Under the null hypothesis that the candidate number of latent classes is correct, an upper bound on the statistic converges to zero in probability. Under an under-fitted alternative, the statistic itself exceeds a fixed positive constant with probability approaching one. This sharp dichotomy yields two sequential testing algorithms that consistently estimate the true number of latent classes. Extensive experimental studies confirm the theoretical findings and demonstrate the accuracy and reliability of both algorithms in determining the number of latent classes.
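The dichotomy described in the abstract (a statistic that is small at a correct candidate and bounded away from zero at an under-fitted one) suggests a simple sequential rule: scan candidates upward and stop at the first one whose statistic falls below a threshold. The sketch below illustrates only that stopping rule. The names `first_accepted_k`, `tau`, and `k_max`, and the toy statistic values, are hypothetical; the actual statistic, its normalization, and the threshold choice are defined in the body of the paper and are not reproduced here.

```python
from typing import Callable


def first_accepted_k(stat: Callable[[int], float], tau: float, k_max: int) -> int:
    """Return the smallest candidate K0 in 1..k_max with stat(K0) < tau.

    `stat` is any user-supplied function mapping a candidate number of
    latent classes to a test statistic value (an assumption of this sketch).
    If no candidate is accepted, the cap k_max is returned.
    """
    for k0 in range(1, k_max + 1):
        if stat(k0) < tau:
            return k0
    return k_max


# Toy values (made up for illustration): large while under-fitted,
# near zero once the candidate reaches 3.
toy_stats = {1: 5.2, 2: 3.1, 3: 0.04, 4: 0.03}
estimate = first_accepted_k(lambda k: toy_stats[k], tau=0.5, k_max=4)
print(estimate)  # 3
```

With these toy values the scan rejects candidates 1 and 2 and stops at 3, the first candidate whose statistic is below the threshold.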

Paper Structure (48 sections, 16 theorems, 277 equations, 3 figures, 4 tables, 3 algorithms)

Introduction
Model and problem
Latent class model
Problem statement
Test statistic
Ideal normalized residual matrix
Practical test statistic
Algorithms
GoF-LCM algorithm
RGoF-LCM algorithm
Numerical Studies
General simulation setup
Class membership matrix $Z$
Item parameter matrix $\Theta$
Response matrix $R$
...and 33 more sections

Key Result

Lemma 1

When Assumption A1 holds, for any $\epsilon > 0$, we have

Figures (3)

Figure 1: Accuracy (left) and running time (right) of GoF-LCM and RGoF-LCM for $K=8$, $J=60$, $\delta=0.3$, with varying $N$.
Figure 2: Accuracy of GoF-LCM and RGoF-LCM for $K=8$, $N=600$, $\delta=0.3$, with varying $J$.
Figure 3: Test statistic $T_{K_0}$ (left) and ratio $r_{K_0}=|T_{K_0-1}/T_{K_0}|$ (right) versus candidate number of latent classes $K_0$ for the BFPT dataset.

Theorems & Definitions (36)

Definition 1: Latent class model for ordinal categorical data
Lemma 1: Spectral norm of ideal residual matrix
Remark 1
Lemma 2: Perturbation control of normalized residual matrix
Theorem 1: Null behavior of test statistic
Theorem 2: Alternative behavior of the test statistic
Theorem 3: Consistency of GoF-LCM
Remark 2: Choice of $\tau_N$
Theorem 4: Asymptotic behavior of the ratio statistic
Theorem 5: Consistency of RGoF-LCM
...and 26 more
