Deep Computerized Adaptive Testing

Jiguang Li, Robert Gibbons, Veronika Rockova

Abstract

Computerized adaptive tests (CATs) play a crucial role in educational assessment and diagnostic screening in behavioral health. Unlike traditional linear tests that administer a fixed set of pre-assembled items, CATs adaptively tailor the test to an examinee's latent trait level by selecting a smaller subset of items based on their previous responses. Existing CAT frameworks predominantly rely on item response theory (IRT) models with a single latent variable, a choice driven by both conceptual simplicity and computational feasibility. However, many real-world item response datasets exhibit complex, multi-factor structures, limiting the applicability of CATs in broader settings. In this work, we develop a novel CAT system that incorporates multivariate latent traits, building on recent advances in Bayesian sparse multivariate IRT. Our approach leverages direct sampling from the latent factor posterior distributions, significantly accelerating existing information-theoretic item selection criteria by eliminating the need for computationally intensive Markov Chain Monte Carlo (MCMC) simulations. Recognizing the potential sub-optimality of existing item selection rules, which are often based on myopic one-step-lookahead optimization of some information-theoretic criterion, we propose a double deep Q-learning algorithm to learn an optimal item selection policy. Through simulation and real-data studies, we demonstrate that our approach not only accelerates existing item selection methods but also highlights the potential of reinforcement learning in CATs.
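To make the contrast with one-step-lookahead rules concrete, the following is a minimal sketch of the standard double Q-learning target, in which the online network chooses the next item and a separate target network evaluates it. The function name, the argument layout, and the masking of already-administered items are illustrative assumptions, not details taken from the paper.

```python
def double_q_target(reward, gamma, q_online_next, q_target_next, available, done):
    """Double Q-learning target for a single transition (hypothetical helper).

    q_online_next, q_target_next: per-item values at the next state.
    available: True where an item has not yet been administered (assumed mask).
    done: True if the test ended at this step.
    """
    candidates = [j for j, ok in enumerate(available) if ok]
    if done or not candidates:
        # No future value once the test ends or the item bank is exhausted.
        return reward
    # The online network selects the item ...
    best = max(candidates, key=lambda j: q_online_next[j])
    # ... and the target network evaluates that choice.
    return reward + gamma * q_target_next[best]


# Item 1 has the largest online value but is masked, so item 2 is selected.
target = double_q_target(
    reward=1.0,
    gamma=0.9,
    q_online_next=[1.0, 3.0, 2.0],
    q_target_next=[0.5, 0.1, 0.9],
    available=[True, False, True],
    done=False,
)
```

Separating selection from evaluation is what distinguishes double Q-learning from the plain variant, which takes the maximum of a single network's values and tends to overestimate them.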

Paper Structure

This paper contains 34 sections, 2 theorems, 39 equations, 16 figures, 6 tables, 1 algorithm.

Key Result

Theorem 4.2

Consider a $K$-factor CAT item selection procedure after selecting $T$ items, with $\mathcal{N}(\bm{0}_K, \mathbbm{I}_K)$ prior placed on the test taker's latent trait $\bm{\theta}$. If $\bm{Y}_{1:T}=\left(y_{j_1}, \ldots, y_{j_T}\right)'$ is conditionally independent binary response data from the two with posterior parameters where $\bm{C}_1 = \text{diag}(2y_{j_1}-1, \cdots, 2y_{j_T}-1) \bm{B}_{1:

Figures (16)

  • Figure 1: Estimated Bifactor Factor Loading Matrix for pCAT-COG
  • Figure 2: High-level architecture of the Q-network. The shared encoder $\phi_1$ maps each tuple of posterior parameters to $\mathbb{R}^{L_1}$, and summing the encoded tuples yields the permutation-invariant representation $g_1(\tilde{\boldsymbol{\xi}}_t)$. The matrix $\bm{\Psi}_t$ is encoded by $\phi_2$. The concatenated vector in $\mathbb{R}^{L}$ is passed to the classifier $\rho$, which outputs $J$ logits; the item $j$ with the largest logit is selected. This network is trained offline using Algorithm 1; during live CAT, its weights are fixed and only the posterior state is updated sequentially.
  • Figure 3: Number of Items Versus Cumulative Percentage of Completed Tests
  • Figure 4: Distributions of Item Exposure Rates
  • Figure 5: pCAT-COG: Primary Factor Posterior Variance Reduction (Left) and Estimation Accuracy (Right)
  • ...and 11 more figures
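
The architecture described in the Figure 2 caption can be sketched in plain Python as follows. This is a rough illustration only: the layer sizes, the single-layer form of $\phi_1$, $\phi_2$ and $\rho$, the random initialisation, and all function names are assumptions made for the example, and no training is shown.

```python
import math
import random


def init_layer(rng, n_out, n_in):
    """Random weights and zero biases for one dense layer (illustrative only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, x, relu=True):
    """Apply one dense layer, optionally followed by a ReLU."""
    weights, bias = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]
    return [max(0.0, v) for v in out] if relu else out


def q_logits(params, tuples, psi_flat):
    """Return J logits from a set of parameter tuples and a flattened matrix."""
    # phi_1 is shared across tuples; summing the encodings makes the result
    # independent of the order in which the tuples are listed.
    encoded = [dense(params["phi1"], t) for t in tuples]
    width = len(params["phi1"][1])
    g1 = [math.fsum(e[i] for e in encoded) for i in range(width)]
    # phi_2 encodes the flattened matrix Psi_t.
    g2 = dense(params["phi2"], psi_flat)
    # rho maps the concatenated representation to one logit per item.
    return dense(params["rho"], g1 + g2, relu=False)


def select_item(logits):
    """Index of the largest logit."""
    return max(range(len(logits)), key=lambda j: logits[j])


rng = random.Random(0)
D, L1, L2, K, J = 3, 4, 3, 2, 5  # hypothetical sizes, not from the paper
params = {
    "phi1": init_layer(rng, L1, D),
    "phi2": init_layer(rng, L2, K * K),
    "rho": init_layer(rng, J, L1 + L2),
}
tuples = [[0.2, -1.0, 0.5], [1.5, 0.3, -0.7], [-0.4, 0.8, 0.1]]
psi_flat = [1.0, 0.2, 0.2, 1.0]
logits = q_logits(params, tuples, psi_flat)
item = select_item(logits)
```

Reordering `tuples` leaves `logits` unchanged, which is the permutation invariance the caption refers to. The sum uses `math.fsum` so that the pooled vector does not depend on floating-point summation order.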

Theorems & Definitions (4)

  • Definition 4.1
  • Theorem 4.2
  • Theorem E.1
  • proof