Robust variable selection for spatial point processes observed with noise
Dominik Sturm, Ivo F. Sbalzarini
TL;DR
Addresses robust variable selection in the intensity of spatial point processes observed with noise, where measurement and detection errors distort covariate effects. The authors combine an adaptive $L_0$ (best-subset) penalty with stability selection based on $p$-thinning subsampling to control the per-family error rate and enhance selection stability, and they compare against adaptive $L_1$ penalties and composite information criteria. The framework relies on estimating-function inference for spatial point processes, proximal-gradient optimization for non-convex penalties, and the Berman-Turner device to link to GLM software. It is evaluated in Poisson and Thomas process settings with localization and detection noise. A real forestry dataset demonstrates that the method yields sparse, interpretable covariate sets consistent with prior findings, without requiring explicit noise modeling.
Abstract
We propose a method for variable selection in the intensity function of spatial point processes that combines sparsity-promoting estimation with noise-robust model selection. As high-resolution spatial data becomes increasingly available through remote sensing and automated image analysis, identifying the spatial covariates that influence the localization of events is crucial for understanding the underlying mechanism. However, results from automated acquisition techniques are often noisy, for example due to measurement uncertainties or detection errors, which lead to spurious displacements and missed events. We study the impact of such noise on sparse point-process estimation across different models, including Poisson and Thomas processes. To improve noise robustness, we propose to use stability selection based on point-process subsampling and to incorporate a non-convex best-subset penalty to enhance model-selection performance. In extensive simulations, we demonstrate that this approach reliably recovers the true covariates under diverse noise scenarios and improves both selection accuracy and stability. We then apply the proposed method to a forestry data set, analyzing the distribution of trees in relation to elevation and soil nutrients in a tropical rain forest. This application shows the practical utility of the method, which provides a systematic framework for robust variable selection in spatial point-process models under noise, without requiring additional knowledge of the process.
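To make the subsampling idea concrete, the following is a minimal sketch of stability selection with independent $p$-thinning, not the authors' implementation. Each point is retained independently with probability $p$, so the thinned pattern has intensity $p$ times the original; for a log-linear intensity this shifts only the intercept and leaves the covariate effects unchanged. The function names, the `select` callback (standing in for any sparse fit, such as an $L_0$- or $L_1$-penalized estimator), and the default values `p=0.5` and `threshold=0.8` are assumptions made for illustration.

```python
import random


def p_thin(points, p, rng):
    """Independent p-thinning: keep each point with probability p."""
    return [pt for pt in points if rng.random() < p]


def selection_frequencies(points, select, n_covariates, p=0.5, n_rep=100, seed=0):
    """Fraction of thinned subsamples in which each covariate is selected.

    `select` is a hypothetical callback that maps a point pattern to an
    iterable of selected covariate indices (e.g. the support of a
    penalized intensity fit). It is a placeholder, not a library function.
    """
    rng = random.Random(seed)
    counts = [0] * n_covariates
    for _ in range(n_rep):
        subsample = p_thin(points, p, rng)
        for j in set(select(subsample)):
            counts[j] += 1
    return [c / n_rep for c in counts]


def stable_set(frequencies, threshold=0.8):
    """Indices of covariates whose selection frequency reaches the threshold."""
    return [j for j, f in enumerate(frequencies) if f >= threshold]
```

In use, `select` would wrap the sparse estimator of choice, and `stable_set(selection_frequencies(points, select, n_covariates))` would return the covariates that are selected consistently across subsamples. The threshold governs the trade-off between false selections and missed covariates and would need to be chosen together with the error-control target.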
