Robust variable selection for spatial point processes observed with noise

Dominik Sturm, Ivo F. Sbalzarini

TL;DR

This paper addresses robust variable selection in the intensity function of spatial point processes observed with noise, where measurement and detection errors distort the estimated covariate effects. The authors combine an adaptive $L_0$ (best-subset) penalty with stability selection based on $p$-thinning subsampling to control the per-family error rate (PFER) and improve selection stability, and they compare this approach against adaptive $L_1$ penalties and composite information criteria. The framework relies on estimating-function inference for spatial point processes, proximal-gradient optimization for the non-convex penalty, and the Berman-Turner device to link the estimation to GLM software. It is evaluated on Poisson and Thomas processes with localization and detection noise. A real forestry dataset demonstrates that the method yields sparse, interpretable covariate sets consistent with prior findings, without requiring explicit noise modeling.
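
The subsampling idea can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes independent $p$-thinning (each point is kept with probability $p$) and uses a hypothetical toy selector, `toy_select`, in place of the penalized intensity fit: it scores each coordinate covariate by how far its mean at the retained points deviates from its window average and applies the hard-thresholding operator associated with an $L_0$ penalty. All function names, the toy data, and the parameter values are assumptions made for illustration.

```python
import numpy as np

def p_thin(points, p, rng):
    """Independent p-thinning: keep each point with probability p."""
    keep = rng.random(len(points)) < p
    return points[keep]

def hard_threshold(z, lam):
    """Proximal operator of lam * ||.||_0 (unit step size):
    entries with |z| <= sqrt(2 * lam) are set to zero."""
    return np.where(np.abs(z) > np.sqrt(2.0 * lam), z, 0.0)

def toy_select(points, lam=0.005):
    """Hypothetical stand-in for a penalized intensity fit.
    Covariates are the coordinates x and y on the unit square; a covariate
    is 'selected' if its mean at the points deviates from 0.5."""
    if len(points) == 0:
        return np.zeros(points.shape[1], dtype=bool)
    score = points.mean(axis=0) - 0.5
    return hard_threshold(score, lam) != 0.0

def selection_frequencies(points, select, p=0.5, n_subsamples=100, seed=0):
    """Fraction of p-thinned subsamples in which each covariate is selected."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(points.shape[1])
    for _ in range(n_subsamples):
        counts += select(p_thin(points, p, rng))
    return counts / n_subsamples

# Toy pattern: density proportional to x on the unit square, so x is an
# informative covariate and y is a noise covariate.
rng = np.random.default_rng(1)
n = 400
pts = np.column_stack([np.sqrt(rng.random(n)), rng.random(n)])

freq = selection_frequencies(pts, toy_select)
stable = freq >= 0.9  # threshold 0.9 as in the Figure 2 caption
```

In an actual analysis, `toy_select` would be replaced by a penalized fit on each thinned pattern, evaluated along a grid of regularization parameters $\lambda$, which is what produces stability paths like those described for Figure 2.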

Abstract

We propose a method for variable selection in the intensity function of spatial point processes that combines sparsity-promoting estimation with noise-robust model selection. As high-resolution spatial data becomes increasingly available through remote sensing and automated image analysis, identifying spatial covariates that influence the localization of events is crucial to understand the underlying mechanism. However, results from automated acquisition techniques are often noisy, for example due to measurement uncertainties or detection errors, which leads to spurious displacements and missed events. We study the impact of such noise on sparse point-process estimation across different models, including Poisson and Thomas processes. To improve noise robustness, we propose to use stability selection based on point-process subsampling and to incorporate a non-convex best-subset penalty to enhance model-selection performance. In extensive simulations, we demonstrate that such an approach reliably recovers true covariates under diverse noise scenarios and improves both selection accuracy and stability. We then apply the proposed method to a forestry data set, analyzing the distribution of trees in relation to elevation and soil nutrients in a tropical rain forest. This shows the practical utility of the method, which provides a systematic framework for robust variable selection in spatial point-process models under noise, without requiring additional knowledge of the process.

Paper Structure

This paper contains 9 sections, 24 equations, 12 figures, 2 tables, 1 algorithm.

Figures (12)

  • Figure 1: Illustration of the simulation setup and the noise types considered here for a Poisson point process with $\mathbb{E}N(W)=150$ and parameters $\beta=(1, 0.5)^\top$. The top row shows samples with different levels (from left to right: $c=0,2,4$) of localization uncertainty (scenario P1). The bottom row shows the same with detection uncertainty (scenario P2). Green points indicate the true simulated point locations; orange crosses show the locations observed with noise. The blue shades visualize the intensity function of the underlying point process (color bar).
  • Figure 2: Regularization paths ($\lambda\in[10^{-4}, 5\times 10^{2}]$) for a Poisson process with parameters $\beta=(1, 0.5)^\top$ observed with localization uncertainty (scenario P1, $\mathbb{E}N(W)=200$, $c=4$). The penalty ($L_0$, $L_1$) is indicated by the row labels on the left. The left panels show the coefficient paths $\beta_j^\lambda$ with the coefficients of the true covariates as symbol lines (ground truth values indicated by dotted lines) and noise covariates as dashed lines. The right panels show the corresponding stability paths $\Pi_j^\lambda$ with the dotted horizontal line indicating the threshold $\pi_{\mathrm{th}}=0.9$ corresponding to $\mathrm{PFER}\leq 1$.
  • Figure 3: Variable-selection performance for a Poisson point process with localization uncertainty (scenario P1). We show the mean (over 100 independent repetitions of each experiment) True Positive Rate (TPR), False Positive Rate (FPR), Positive Predictive Value (PPV), $F_1$ score, and feature-selection stability $\Phi_S$ for model selection using the BIC (A), ERIC (B), and stability selection with $\mathrm{PFER}\leq 1$ (C) with adaptive $L_0$ (top row of each subfigure) and adaptive $L_1$ (bottom row of each subfigure) penalties. Each panel shows a performance metric (top titles, color bars) for different noise magnitudes $c$ ($x$-axis) and sample sizes $\mathbb{E}N(W)$ ($y$-axis). The average metrics over all 25 experiments are given in the panel titles with arrows ($\uparrow/\downarrow$) indicating the direction of improvement.
  • Figure 4: Empirical confirmation that stability selection achieves the desired error bound $\mathrm{PFER}\leq 1$ for all experiments in scenario P1.
  • Figure 5: Variable-selection performance for a Poisson point process with detection uncertainty (scenario P2). We show the mean (over 100 independent repetitions of each experiment) True Positive Rate (TPR), False Positive Rate (FPR), Positive Predictive Value (PPV), $F_1$ score, and feature-selection stability $\Phi_S$ for model selection using the BIC (A), ERIC (B), and stability selection with $\mathrm{PFER}\leq 1$ (C) with adaptive $L_0$ (top row of each subfigure) and adaptive $L_1$ (bottom row of each subfigure) penalties. Each panel shows a performance metric (top titles, color bars) for different noise magnitudes $c$ ($x$-axis) and sample sizes $\mathbb{E}N(W)$ ($y$-axis). The average metrics over all 25 experiments are given in the panel titles with arrows ($\uparrow/\downarrow$) indicating the direction of improvement.
  • ...and 7 more figures
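
The link between the threshold $\pi_{\mathrm{th}}=0.9$ and $\mathrm{PFER}\leq 1$ mentioned in the captions of Figures 2 to 5 can be made concrete with the classical stability-selection bound of Meinshausen and Bühlmann. Whether the paper uses exactly this bound is not stated in this excerpt, so the following is a sketch under that assumption. The symbols $V$ (number of falsely selected covariates), $q$ (average number of covariates selected per subsample over the $\lambda$ range), and $d$ (total number of candidate covariates) are introduced here for illustration.

$$
\mathbb{E}[V] \;\le\; \frac{1}{2\pi_{\mathrm{th}} - 1}\,\frac{q^{2}}{d},
\qquad \pi_{\mathrm{th}} \in (0.5, 1).
$$

For $\pi_{\mathrm{th}} = 0.9$ this gives $\mathbb{E}[V] \le 1.25\, q^{2}/d$, so $\mathrm{PFER}\leq 1$ is guaranteed whenever $q \le \sqrt{0.8\, d}$. The bound holds under an exchangeability assumption on the noise covariates and requires that the base selector performs no worse than random guessing.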