Nonparametric Least Squares Estimators for Interval Censoring
Piet Groeneboom
TL;DR
The paper addresses the open problem of the limit distribution of the nonparametric MLE in interval censoring with more than one observation time per subject (case 2, non-separated). It introduces and analyzes two isotonic nonparametric least squares (LS) estimators, proves consistency, and derives a limit distribution of Brownian-motion-with-drift type for the main LS estimator, together with a parallel limit for a simpler one-step LS variant. It uses smooth functional theory to study the asymptotic behavior of smooth functionals, and it compares the LS estimators with the MLE in simulations. The findings show an $n^{1/3}$ convergence rate at a fixed point, with a limit expressed through the minimizer of a Brownian motion with drift. The MLE's conjectured faster rate is not visible at moderate sample sizes, and the LS estimator often has smaller pointwise variance in practice. The work provides practical computational methods (the iterative convex minorant algorithm) and an asymptotic framework for interval censoring problems. Overall, the results offer a consistent, computable alternative to the MLE in the non-separated regime and lay out a detailed smooth-functional approach to the asymptotics, including a complete treatment of the simpler LS variant.
Abstract
The limit distribution of the nonparametric maximum likelihood estimator for interval censored data with more than one observation time per unobservable observation is still unknown in general. For the so-called separated case, where the observation times are at a distance larger than a fixed positive epsilon from each other, the limit distribution was derived in [5]. For the non-separated case there is a conjectured limit distribution, given in [10], Section 5.2 of Part 2. Whether this conjecture holds is still unknown, but the present paper shows that for sample sizes 1000 and 10,000 this limit behavior is still not clearly seen. We prove consistency of a related nonparametric isotonic least squares estimator and give a sketch of the proof for its limit distribution. We also provide simulation results to show how the nonparametric MLE and the least squares estimator behave in comparison. Moreover, we discuss a simpler least squares estimator that can be computed in one step, but is inferior to the other least squares estimator, since it does not use all information. For the simplest model of interval censoring, the current status model, the nonparametric maximum likelihood and least squares estimators are the same. This equivalence breaks down if there is more than one observation time per unobservable observation. The computations for the simulation of the more complicated interval censoring model were performed using the iterative convex minorant algorithm; they are provided in the GitHub repository [7].
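
To illustrate the remark on the current status model: with one observation time $T_i$ per subject and indicator $\Delta_i = 1\{X_i \le T_i\}$, the common value of the nonparametric MLE and the least squares estimator is the isotonic regression of the $\Delta_i$ on the ordered $T_i$. The following is a minimal sketch using the pool-adjacent-violators algorithm, not the iterative convex minorant code of repository [7]. The function names `pava` and `current_status_estimate`, and the simulated exponential event times and uniform observation times, are assumptions made only for this illustration.

```python
import random


def pava(y, w=None):
    """Weighted least-squares nondecreasing fit (pool-adjacent-violators)."""
    n = len(y)
    if w is None:
        w = [1.0] * n
    # Each block holds [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    # Expand block means back to one fitted value per input point.
    fit = []
    for m, _, c in blocks:
        fit.extend([m] * c)
    return fit


def current_status_estimate(times, deltas):
    """Isotonic regression of the indicators on the sorted observation times."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    t_sorted = [times[i] for i in order]
    d_sorted = [deltas[i] for i in order]
    return t_sorted, pava(d_sorted)


# Simulated data (illustrative assumption, not the paper's simulation design).
rng = random.Random(0)
n = 200
x = [rng.expovariate(1.0) for _ in range(n)]    # unobservable event times X_i
t = [rng.uniform(0.0, 3.0) for _ in range(n)]   # observation times T_i
delta = [1 if xi <= ti else 0 for xi, ti in zip(x, t)]

t_sorted, F_hat = current_status_estimate(t, delta)
```

Here `F_hat[k]` is the estimate of the distribution function at `t_sorted[k]`. With more than one observation time per subject, the least squares criterion no longer reduces to a single isotonic regression of this form, which is where the iterative algorithm mentioned above comes in.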
