Random pairing MLE for estimation of item parameters in Rasch model
Yuepeng Yang, Cong Ma
TL;DR
The paper introduces RP-MLE and its bootstrapped variant MRP-MLE, which estimate Rasch item parameters from sparse binary responses by randomly pairing each user's responses into item-item comparisons. This reduces the data to an item-only Bradley-Terry-like (BT-like) model with independent edges. The paper establishes finite-sample $\ell_{\infty}$ guarantees that are minimax-optimal, provides non-asymptotic distributional expansions for uncertainty quantification, and demonstrates exact top-$K$ recovery under sparse sampling. It also derives asymptotic normality results linking MRP-MLE to a weighted pseudo MLE, quantifies covariance shifts with multiple data splits, and develops confidence-interval procedures validated via simulations and a real LSAT dataset. Overall, RP-MLE and MRP-MLE offer statistically rigorous, scalable, and uncertainty-aware estimation of item parameters in Rasch models under sparse data conditions, with practical applicability to educational testing and human-annotated data.
Abstract
The Rasch model, a classical model in item response theory, is widely used in psychometrics to model the relationship between individuals' latent traits and their binary responses to assessments or questionnaires. In this paper, we introduce a new likelihood-based estimator, the random pairing maximum likelihood estimator ($\mathrm{RP\text{-}MLE}$), and its bootstrapped variant, the multiple random pairing MLE ($\mathrm{MRP\text{-}MLE}$), both of which faithfully estimate the item parameters in the Rasch model. The new estimators have several appealing features compared to existing ones. First, both work for sparse observations, an increasingly important scenario in the big data era. Second, both estimators are provably minimax optimal in terms of finite-sample $\ell_{\infty}$ estimation error. Lastly, both admit a precise distributional characterization that allows uncertainty quantification on the item parameters, e.g., construction of confidence intervals for the item parameters. The main idea underlying $\mathrm{RP\text{-}MLE}$ and $\mathrm{MRP\text{-}MLE}$ is to randomly pair user-item responses to form item-item comparisons. This pairing is carefully designed to reduce the problem size while retaining statistical independence. We also provide empirical evidence of the efficacy of the two new estimators using both simulated and real data.
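To make the random pairing idea concrete, the following is a minimal sketch of the pairing step only, not the authors' exact procedure. It assumes responses are stored as a nested dictionary, that each of a user's items is used in at most one pair, that an unmatched leftover item is dropped, and that pairs with identical responses are discarded. The function name `random_pairing` and the data layout are hypothetical.

```python
import random


def random_pairing(responses, seed=0):
    """Turn user-item responses into item-item comparisons (illustrative sketch).

    responses: {user: {item: 0 or 1}}, where 1 means a correct/positive response.
    Returns a list of (correct_item, incorrect_item) tuples.
    """
    rng = random.Random(seed)
    comparisons = []
    for user, answered in responses.items():
        # Shuffle this user's items, then pair consecutive entries so that
        # each response is used in at most one comparison.
        items = sorted(answered)
        rng.shuffle(items)
        for k in range(0, len(items) - 1, 2):
            i, j = items[k], items[k + 1]
            # Assumption: only pairs with different responses are informative;
            # pairs with equal responses are discarded.
            if answered[i] != answered[j]:
                comparisons.append((i, j) if answered[i] == 1 else (j, i))
    return comparisons


# Hypothetical toy data: three users, four items.
demo = {
    "u1": {"a": 1, "b": 0, "c": 1, "d": 0},
    "u2": {"a": 1, "b": 1},
    "u3": {"a": 0, "c": 0, "d": 1},
}
pairs = random_pairing(demo, seed=1)
```

Under the common Rasch parameterization, where the probability of a correct response depends only on the difference between user ability and item difficulty, the outcome of a pair with differing responses has a conditional distribution free of the user parameter. This is why the resulting comparisons can be fed to a BT-like likelihood over items alone.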
