Semiparametric Causal Inference for Right-Censored Outcomes with Many Weak Invalid Instruments
Qiushi Bu, Wen Su, Xingqiu Zhao, Zhonghua Liu
TL;DR
This work develops MAWII-Surv, a semiparametric framework for causal inference with right-censored outcomes and many weak invalid instruments. Identification relies on heteroscedasticity in the treatment model under a structural accelerated failure time (AFT) model. Inference uses GEL-NOW (Generalized Empirical Likelihood with Non-Neyman Orthogonal and Weak moments), which handles a diverging number of Neyman orthogonal nuisance functions estimated with deep neural networks together with a non-Neyman orthogonal nuisance, the conditional censoring distribution, and explicitly accounts for the variance inflation that the censoring weights introduce. A local Kaplan–Meier approach estimates the non-Neyman orthogonal nuisance, and a censoring-adjusted over-identification test extends classical diagnostics to censored data. Theoretical results establish consistency and asymptotic normality with a novel variance decomposition. Simulations and UK Biobank applications demonstrate robustness and practical utility, including the ability to uncover relationships masked by unmeasured confounding. Overall, the framework supports robust Mendelian randomization with censored survival outcomes in large biobank studies.
Abstract
We propose a semiparametric framework for causal inference with right-censored survival outcomes and many weak invalid instruments, motivated by Mendelian randomization in biobank studies where classical methods may fail. We adopt an accelerated failure time model and construct a moment condition based on augmented inverse probability of censoring weighting, incorporating both uncensored and censored observations. Under a heteroscedasticity-based condition on the treatment model, we establish point identification of the causal effect despite censoring and invalid instruments. We propose GEL-NOW (Generalized Empirical Likelihood with Non-Neyman Orthogonal and Weak moments) for valid inference under these conditions. A diverging number of Neyman orthogonal nuisance functions is estimated using deep neural networks. A key challenge is that the conditional censoring distribution is a non-Neyman orthogonal nuisance, so it contributes to the first-order asymptotics of the estimator for the target causal effect parameter. We derive the asymptotic distribution and explicitly incorporate this additional uncertainty into the asymptotic variance formula. We also introduce a censoring-adjusted over-identification test that accounts for this new variance component. Simulation studies and UK Biobank applications demonstrate the method's robustness and practical utility.
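
To make the role of the censoring weights concrete, the following is a minimal sketch of plain inverse probability of censoring weighting (IPCW). It is a simplification and not the estimator proposed here: it uses a marginal Kaplan–Meier estimate of the censoring distribution rather than a conditional (local) one, omits the augmentation term, and ignores instruments. It also assumes no tied observation times. The function names `km_censoring_survival` and `ipcw_weights` are hypothetical.

```python
import numpy as np

def km_censoring_survival(time, event):
    """Kaplan-Meier estimate of the censoring survival G(t) = P(C > t).

    time  : observed times min(T, C)
    event : 1 if the event was observed, 0 if censored
    Returns the sorted distinct times and G evaluated at each of them.
    Assumption: no ties between event and censoring times.
    """
    time = np.asarray(time, dtype=float)
    censored = 1 - np.asarray(event, dtype=int)  # censoring is the "event" here
    uniq = np.unique(time)
    surv = np.empty(len(uniq))
    s = 1.0
    for k, t in enumerate(uniq):
        at_risk = np.sum(time >= t)
        n_cens = np.sum((time == t) & (censored == 1))
        s *= 1.0 - n_cens / at_risk
        surv[k] = s
    return uniq, surv

def ipcw_weights(time, event):
    """IPCW weights delta_i / G(T_i-), with G(T_i-) the left limit of G."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=float)
    uniq, surv = km_censoring_survival(time, event)
    # Index of the last distinct time strictly before each observed time.
    idx = np.searchsorted(uniq, time, side="left") - 1
    g_minus = np.where(idx >= 0, surv[np.clip(idx, 0, None)], 1.0)
    return event / g_minus

# Toy data: the second subject is censored at time 2.
w = ipcw_weights([1, 2, 3, 4], [1, 0, 1, 1])
print(w)
```

Censored subjects receive weight zero, and uncensored subjects observed after a censoring time are up-weighted to represent those lost to follow-up. Because the weights depend on an estimated censoring distribution, the uncertainty in that estimate carries over to any estimator built on them, which is the kind of additional variance the abstract refers to.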
