Random pairing MLE for estimation of item parameters in Rasch model

Yuepeng Yang; Cong Ma

TL;DR

The paper introduces RP-MLE and its bootstrapped variant MRP-MLE to estimate Rasch item parameters from sparse binary responses. Both estimators randomly pair responses into item-item comparisons, which reduces the data to an item-only Bradley-Terry-Luce (BTL)-like model with independent edges. The paper establishes finite-sample $\ell_{\infty}$ guarantees that are minimax-optimal, provides non-asymptotic distributional expansions for uncertainty quantification, and demonstrates exact top-$K$ recovery under sparse sampling. It also derives asymptotic normality results linking MRP-MLE to a weighted pseudo MLE, quantifies how the covariance shifts with multiple data splits, and develops confidence-interval procedures validated via simulations and a real LSAT dataset. Overall, RP-MLE and MRP-MLE offer statistically rigorous, scalable, and uncertainty-aware estimation of item parameters in Rasch models under sparse data conditions, with practical applicability to educational testing and human-annotated data.
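A short derivation may help explain why pairing two responses from the same user yields a comparison that no longer involves the user. It is only a sketch under an assumed parametrization: the symbols $a_{t}$ (ability of user $t$), $\beta_{i}$ (difficulty of item $i$), and $X_{ti}\in\{0,1\}$ (response of user $t$ to item $i$) are illustrative and are not taken from the paper, whose sign convention may differ. Assume the Rasch model in the form

$$\mathbb{P}(X_{ti}=1)=\frac{e^{a_{t}-\beta_{i}}}{1+e^{a_{t}-\beta_{i}}},$$

with responses to distinct items independent given the parameters. For two items $i\neq j$ answered by the same user $t$,

$$\frac{\mathbb{P}(X_{ti}=0,\,X_{tj}=1)}{\mathbb{P}(X_{ti}=1,\,X_{tj}=0)}=\frac{e^{a_{t}-\beta_{j}}}{e^{a_{t}-\beta_{i}}}=e^{\beta_{i}-\beta_{j}},$$

because the two normalizing denominators are the same in the numerator and the denominator. Conditioning on the two responses being different therefore gives

$$\mathbb{P}\bigl(X_{ti}=0,\,X_{tj}=1\;\big|\;X_{ti}+X_{tj}=1\bigr)=\frac{e^{\beta_{i}}}{e^{\beta_{i}}+e^{\beta_{j}}},$$

which is a BTL-type comparison probability between items $i$ and $j$ and does not depend on $a_{t}$.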

Abstract

The Rasch model, a classical model in item response theory, is widely used in psychometrics to model the relationship between individuals' latent traits and their binary responses to assessments or questionnaires. In this paper, we introduce a new likelihood-based estimator, the random pairing maximum likelihood estimator ($\mathrm{RP\text{-}MLE}$), and its bootstrapped variant, the multiple random pairing MLE ($\mathrm{MRP\text{-}MLE}$), which faithfully estimate the item parameters in the Rasch model. The new estimators have several appealing features compared to existing ones. First, both work for sparse observations, an increasingly important scenario in the big data era. Second, both estimators are provably minimax optimal in terms of finite sample $\ell_{\infty}$ estimation error. Lastly, both admit precise distributional characterization that allows uncertainty quantification on the item parameters, e.g., construction of confidence intervals for the item parameters. The main idea underlying $\mathrm{RP\text{-}MLE}$ and $\mathrm{MRP\text{-}MLE}$ is to randomly pair user-item responses to form item-item comparisons. This is carefully designed to reduce the problem size while retaining statistical independence. We also provide empirical evidence of the efficacy of the two new estimators using both simulated and real data.
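The pairing idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`simulate_rasch`, `random_pairing`, `fit_btl`), the standard normal ability distribution, the "difficulty" sign convention, and the plain gradient-ascent solver are all assumptions made for illustration. Each user's answered items are shuffled and paired so that every response is used at most once, pairs with differing responses are kept as item-item comparisons, and a BTL-type likelihood is then maximized over the item parameters.

```python
import numpy as np


def simulate_rasch(n_users, beta, p, rng):
    """Sparse Rasch responses: 1.0/0.0 where observed, NaN where unobserved."""
    ability = rng.normal(size=n_users)  # assumed ability distribution
    prob = 1.0 / (1.0 + np.exp(-(ability[:, None] - beta[None, :])))
    resp = (rng.random(prob.shape) < prob).astype(float)
    resp[rng.random(prob.shape) >= p] = np.nan  # each entry observed w.p. p
    return resp


def random_pairing(responses, rng):
    """Turn user-item responses into item-item comparison counts.

    wins[i, j] counts pairs in which a user answered item i incorrectly and
    item j correctly, i.e. item i "looked harder" than item j.
    """
    n_users, m_items = responses.shape
    wins = np.zeros((m_items, m_items))
    for t in range(n_users):
        answered = np.flatnonzero(~np.isnan(responses[t]))
        rng.shuffle(answered)
        # Pair consecutive items; a leftover unpaired item is dropped, so
        # each response enters at most one comparison.
        for a, b in zip(answered[0::2], answered[1::2]):
            xa, xb = responses[t, a], responses[t, b]
            if xa != xb:  # only discordant pairs carry a comparison
                harder, easier = (a, b) if xa == 0 else (b, a)
                wins[harder, easier] += 1
    return wins


def fit_btl(wins, n_iter=500):
    """Maximize sum_ij wins[i, j] * log sigmoid(beta_i - beta_j) by gradient ascent."""
    total = wins + wins.T  # number of comparisons per item pair
    beta = np.zeros(wins.shape[0])
    scale = max(total.sum(axis=1).max(), 1.0)  # conservative step-size scale
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(beta[:, None] - beta[None, :])))
        grad = (wins - total * prob).sum(axis=1)
        beta += grad / scale
        beta -= beta.mean()  # parameters are identified only up to a shift
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    beta_true = np.linspace(-1.0, 1.0, 6)
    resp = simulate_rasch(20000, beta_true, p=0.5, rng=rng)
    beta_hat = fit_btl(random_pairing(resp, rng))
    print(np.round(beta_hat - beta_true, 3))
```

Repeating `random_pairing` with fresh randomness and averaging the resulting `fit_btl` outputs would mimic the averaged estimator $\frac{1}{k}\sum_{i=1}^{k}\widehat{\bm{\theta}}_{(i)}$ that appears in the Figure 2 caption, again only as an illustration.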

Paper Structure (70 sections, 22 theorems, 218 equations, 8 figures, 1 table, 2 algorithms)

This paper contains 70 sections, 22 theorems, 218 equations, 8 figures, 1 table, 2 algorithms.

Introduction
Main contributions
Prior art
Item response theory.
Latent score estimation for Rasch model.
The Bradley-Terry-Luce model with sparse comparisons.
Notation.
Problem setup and new estimators
Problem setup
Random pairing maximum likelihood estimator
Random pairing to construct item-item comparisons.
A variant via bootstrapping.
Main results
$\ell_{\infty}$ error bounds and top-$K$ recovery
Finite sample minimax optimality.
...and 55 more sections

Key Result

Theorem 1

Suppose that $mp\ge2$ and $np\ge C_{1}\kappa_{1}^{4}\kappa_{2}^{5}\log^{3}(n)$ for some sufficiently large constant $C_{1}>0$. Suppose that there exists some constant $\alpha>0$ such that $m\le n^{\alpha}$. Let $\widehat{\bm{\theta}}$ be the $\mathrm{RP\text{-}MLE}$ estimator. With probability at least […]. Consequently, the estimator is able to exactly recover the top-$K$ items as soon as […]. Here $C_{2}$, […]
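The step from an $\ell_{\infty}$ bound to exact top-$K$ recovery follows a standard argument, sketched here under assumed notation; it does not reproduce the theorem's own bound or sample-size condition. Assume the true item parameters are ordered as $\theta^{\star}_{(1)}\ge\cdots\ge\theta^{\star}_{(m)}$, that the top-$K$ items are those with the $K$ largest values, and that $\Delta_{K}:=\theta^{\star}_{(K)}-\theta^{\star}_{(K+1)}>0$ (this definition of $\Delta_{K}$ is an assumption). If

$$\|\widehat{\bm{\theta}}-\bm{\theta}^{\star}\|_{\infty}<\frac{\Delta_{K}}{2},$$

then for any item $i$ in the true top-$K$ set and any item $j$ outside it,

$$\widehat{\theta}_{i}-\widehat{\theta}_{j}\;\ge\;(\theta^{\star}_{i}-\theta^{\star}_{j})-2\|\widehat{\bm{\theta}}-\bm{\theta}^{\star}\|_{\infty}\;>\;\Delta_{K}-\Delta_{K}=0,$$

so ranking the items by $\widehat{\bm{\theta}}$ returns exactly the true top-$K$ set. Any high-probability $\ell_{\infty}$ bound that falls below $\Delta_{K}/2$ therefore implies exact recovery with the same probability.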

Figures (8)

Figure 1: Estimation error $\|\widehat{\bm{\theta}}-\bm{\theta}^{\star}\|_{\infty}$ of $\mathrm{RP\text{-}MLE}$ with varying $n$ and $p$. Each point represents the average of 1000 trials.
Figure 2: Estimation error of $\mathrm{MRP\text{-}MLE}$ with varying number of data splittings. For each trial, we record $\|\frac{1}{k}\sum_{i=1}^{k}\widehat{\bm{\theta}}_{(i)}-\bm{\theta}^{\star}\|$ for $k=1,\ldots,100$. The parameters are chosen to be $m=50,p=0.2,n=10000$. The latent scores are all 0 and each user is assigned $mp$ items uniformly at random. Each point is averaged over 1000 trials.
Figure 3: $\|\widehat{\bm{\theta}}-\bm{\theta}^{\star}\|_{\infty}$ vs. $n$ using the spectral method, PMLE, $\mathrm{RP\text{-}MLE}$, and $\mathrm{MRP\text{-}MLE}$ with 20 data splittings. The parameters are chosen to be $m=50,p=0.1$, and $n$ varies from $5000$ to $20000$. The results are averaged over 1000 trials.
Figure 4: $\|\widehat{\bm{\theta}}-\bm{\theta}^{\star}\|_{\infty}$ vs. $\log(\kappa)$ using the spectral method, PMLE, $\mathrm{RP\text{-}MLE}$, and $\mathrm{MRP\text{-}MLE}$ with 20 data splittings. The parameters are chosen to be $m=50,p=0.1,n=20000$, and $\kappa$ varies from $1$ to $e^{10}$. The results are averaged over 1000 trials.
Figure 5: Top-$K$ recovery rate using the spectral method, PMLE, $\mathrm{RP\text{-}MLE}$, and $\mathrm{MRP\text{-}MLE}$ with 20 data splittings. The parameters are chosen to be $n=10000,m=50,p=0.1,K=5$, and $\Delta_{K}$ varies from 0.1 to 0.7. The results are averaged over 1000 trials.
...and 3 more figures

Theorems & Definitions (26)

Theorem 1
Proposition 1: Minimax lower bound, Theorems 3.3 and 3.4 in nguyen2023optimal
Proposition 2: Informal, Theorem 3.1 in nguyen2023optimal
Theorem 2
Proposition 3
Proposition 4
Theorem 3
Theorem 4
Proposition 5
Lemma 1: Theorem 3 in Yang2024
...and 16 more
