Attenuation Bias with Latent Predictors
Connor T. Jerzak, Stephen A. Jessee
TL;DR
This paper analyzes attenuation bias when regressors are latent traits estimated from indicators. It shows that conventional corrections, such as instrumental variables (IV) and the method of composition (MOC), can mis-correct for error in latent predictors because of the rescaling that identification imposes on latent-trait estimates. It then introduces a modular correlation-corrected estimator based on split indicators that yields consistent slopes under standard assumptions. The method uses two independent latent-trait estimates, derives a correction factor from their correlation, and can be applied with any latent-trait estimator, including additive scores, factor models, or machine learning (ML) approaches. Through theory, simulations, and empirical applications (e.g., political knowledge predicting duty to vote), the authors demonstrate substantial reductions in bias, with results often close to those from full joint estimation, and they provide open-source software for implementation. The work highlights the need to tailor error correction to latent predictors, offering a practical, scalable tool for robust inference in political science and related fields.
Abstract
Many core concepts in political science are latent and therefore can only be measured with error. Measurement error in a predictor attenuates slope coefficient estimates in regression, biasing them toward zero. We show that widely used strategies for correcting attenuation bias, including instrumental variables and the method of composition, are themselves biased when applied to latent regressors, sometimes even more so than a simple regression that ignores the measurement error altogether. We derive a correlation-based correction using split-sample measurement strategies. Rather than assuming a particular estimation strategy for the latent trait, our approach is modular and can be easily deployed with a wide variety of latent trait measurement strategies, including additive score, factor, or machine learning models. It requires no joint estimation while yielding consistent slopes under standard assumptions. Simulations and applications show that estimated relationships are stronger after our correction, sometimes by as much as 50%. Open-source software implements the procedure. Results underscore that latent predictors demand tailored error correction; otherwise, conventional practice can exacerbate bias.
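To make the split-indicator idea concrete, the following is a minimal simulation sketch in Python. It is not the authors' software, and the exact form of their estimator is not given in this section. The sketch assumes additive scores as the latent-trait estimator, two halves of equal reliability, a true trait with unit variance, and a correction that divides the naive slope by the square root of the correlation between the two split estimates. All function names and parameter values (`split_half_slopes`, `simulate`, `noise_sd`, and so on) are hypothetical and chosen only for illustration.

```python
import numpy as np


def standardize(v):
    """Rescale to mean 0 and variance 1 (a common identification constraint)."""
    return (v - v.mean()) / v.std()


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


def split_half_slopes(indicators, y, rng):
    """Return naive, IV-style, and correlation-corrected slopes.

    indicators: (n, k) array of noisy items; y: (n,) outcome.
    Assumption: the correction factor is sqrt(corr(score_a, score_b)),
    which requires a positive correlation between the two halves.
    """
    k = indicators.shape[1]
    perm = rng.permutation(k)
    half_a, half_b = perm[: k // 2], perm[k // 2:]
    # Two latent-trait estimates from disjoint item sets (additive scores here).
    score_a = standardize(indicators[:, half_a].mean(axis=1))
    score_b = standardize(indicators[:, half_b].mean(axis=1))
    r = np.corrcoef(score_a, score_b)[0, 1]
    # Naive slope: regress y on each standardized score and average.
    naive = 0.5 * (ols_slope(score_a, y) + ols_slope(score_b, y))
    # IV-style slope: score_b serves as the instrument for score_a.
    iv = np.cov(score_b, y)[0, 1] / np.cov(score_b, score_a)[0, 1]
    corrected = naive / np.sqrt(r)
    return {"naive": naive, "iv": iv, "corrected": corrected, "r": r}


def simulate(n=20_000, k=10, beta=1.0, noise_sd=2.0, seed=0):
    """Simulate a unit-variance trait, k noisy items, and a linear outcome."""
    rng = np.random.default_rng(seed)
    trait = rng.standard_normal(n)
    indicators = trait[:, None] + noise_sd * rng.standard_normal((n, k))
    y = beta * trait + rng.standard_normal(n)
    return split_half_slopes(indicators, y, rng)


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this simulated setup, each standardized score has unit variance, so the naive slope shrinks by the square root of the split-half reliability rather than by the reliability itself. The IV-style ratio divides by the full reliability and therefore overshoots the true slope, while dividing by the square root of the split correlation recovers it. This holds only under the sketch's assumptions.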
