
A reduced rank model for spatial categorical data with many classes

Paul B May, Andrew Simpson, Semhar Michael

Abstract

We develop an identifiable reduced-rank spatial multinomial model for categorical data with many classes. The model represents class-specific spatial effects through a low-dimensional set of shared latent factors, substantially reducing parameter dimension while preserving joint dependence across classes. Because standard conjugate and Pólya-Gamma methods fail under this factorization, we propose a Gibbs sampler using Laplace-approximation proposals within Metropolis-Hastings updates. Simulation studies examine dimension selection and the accuracy of the Laplace proposals. An application to dominant tree species mapping in the Blue Ridge Mountains demonstrates scalable inference and flexible joint predictions for individual classes, class unions, and area-level summaries.
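The idea of accepting or rejecting Laplace-approximation proposals within a Metropolis-Hastings update can be illustrated on a scalar toy problem. The sketch below is not the authors' sampler: it assumes a single logit `w` with a binomial likelihood (`y` successes in `n` trials) and a zero-mean Gaussian prior with precision `omega`. All function names are hypothetical and chosen for illustration only.

```python
import numpy as np

def log_post(w, y, n, omega):
    # Unnormalized log posterior: binomial-logit likelihood plus Gaussian prior.
    return y * w - n * np.logaddexp(0.0, w) - 0.5 * omega * w**2

def laplace_fit(y, n, omega, iters=50):
    # Newton iterations for the posterior mode (adequate for this concave toy target).
    w = 0.0
    for _ in range(iters):
        s = 1.0 / (1.0 + np.exp(-w))
        grad = y - n * s - omega * w
        hess = -n * s * (1.0 - s) - omega
        w -= grad / hess
    s = 1.0 / (1.0 + np.exp(-w))
    hess = -n * s * (1.0 - s) - omega
    return w, -1.0 / hess  # mode and Laplace variance

def mh_laplace(y, n, omega, n_iter=5000, seed=0):
    # Independence Metropolis-Hastings with the Laplace approximation as proposal.
    rng = np.random.default_rng(seed)
    mode, var = laplace_fit(y, n, omega)
    sd = np.sqrt(var)

    def log_q(w):
        # Gaussian proposal log density, up to an additive constant.
        return -0.5 * (w - mode) ** 2 / var

    w = mode
    draws = np.empty(n_iter)
    accepted = 0
    for t in range(n_iter):
        prop = rng.normal(mode, sd)
        log_ratio = (log_post(prop, y, n, omega) - log_q(prop)) - (
            log_post(w, y, n, omega) - log_q(w)
        )
        if np.log(rng.uniform()) < log_ratio:
            w = prop
            accepted += 1
        draws[t] = w
    return draws, accepted / n_iter

draws, rate = mh_laplace(y=3, n=10, omega=1.0)
```

Always accepting the proposal (skipping the ratio test) would correspond to sampling from the Laplace approximation itself, whereas the accept/reject step targets the exact posterior of this toy model.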

Paper Structure (12 sections, 20 equations, 5 figures)

Figures (5)

  • Figure 1: The results of 100 simulations with $J=5$ and $u_\text{true}=2$. The true dimension of the latent effects is often not selected by WAIC and PSIS-LOO, but neither is the true dimension always superior for interpolating a single multivariate realization given limited observations. On average, using the true dimension delivers better predictive performance, but selecting a dimension via WAIC or PSIS-LOO is substantially better than defaulting to a full rank model.
  • Figure 2: Posterior expected values of the logits with varying marginal precisions, $\omega$, compared between two inference methods. The first method accepts or rejects Laplace approximation proposals of $\boldsymbol{w}_j|\cdots;j=1,\ldots,u$ through a Metropolis-Hastings step, asymptotically sampling from the true posterior. The second method is a nested Laplace approximation, where the Laplace approximation is always accepted. Lower marginal precisions produce a more severe linear bias in the predictions of the nested Laplace approximation and lower acceptance rates in the Metropolis-Hastings step.
  • Figure 3: Observations of 24 unique dominant species classes across the Blue Ridge Mountains of North Carolina.
  • Figure 4: Iterations of the ternary search to minimize WAIC.
  • Figure 5: Example posterior predictions across the study area, demonstrating the ability to predict probabilities for single classes, unions of classes, and area summaries.
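
A ternary search over an integer latent dimension, as mentioned in the Figure 4 caption, might look like the following minimal sketch. It assumes the criterion is strictly unimodal in the dimension, which is an assumption made here for illustration. The function `ternary_search_min` and the toy criterion `toy_waic` are hypothetical stand-ins; in practice each evaluation would require fitting the model at that dimension.

```python
from typing import Callable, Dict

def ternary_search_min(score: Callable[[int], float], lo: int, hi: int) -> int:
    """Return the integer in [lo, hi] minimizing `score`.

    Assumes `score` is strictly unimodal on [lo, hi] (decreasing, then increasing).
    """
    cache: Dict[int, float] = {}

    def f(u: int) -> float:
        # Cache evaluations, since each one may be an expensive model fit.
        if u not in cache:
            cache[u] = score(u)
        return cache[u]

    while hi - lo > 2:
        third = (hi - lo) // 3
        m1, m2 = lo + third, hi - third
        if f(m1) < f(m2):
            hi = m2 - 1  # the minimizer lies strictly below m2
        else:
            lo = m1 + 1  # the minimizer lies strictly above m1
    return min(range(lo, hi + 1), key=f)

def toy_waic(u: int) -> float:
    # Hypothetical stand-in for a WAIC curve with its minimum at u = 7.
    return (u - 7) ** 2 + 100.0

best_u = ternary_search_min(toy_waic, 1, 23)
```

Each iteration discards roughly a third of the remaining candidates, so the number of model fits grows logarithmically with the size of the search range rather than linearly.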