Prior Smoothing for Multivariate Disease Mapping Models

Garazi Retegui, María Dolores Ugarte, Jaione Etxeberria, Alan E. Gelfand

TL;DR

The paper tackles how spatial priors affect smoothing in multivariate areal disease mapping and provides a principled way to quantify it. It introduces a theoretical multivariate total conditional variance (TCV) and practical empirical smoothing metrics to compare priors across diseases and spatial resolutions, within a unified M-model framework. Through simulations with GP-based rate surfaces and real data from Spain, the authors show that iCAR induces the most smoothing while LCAR and L$_j$CAR induce less, with smoothing intensifying as the number of areas grows and disease rates diverge. The work offers actionable guidance for prior choice and emphasizes area-specific smoothing diagnostics to support transparent interpretation in public health decision-making.

Abstract

To date, we have seen the emergence of a large literature on multivariate disease mapping. That is, incidence of (or mortality from) multiple diseases is recorded at the scale of areal units, where incidence (mortality) across the diseases is expected to manifest dependence. The modeling involves a hierarchical structure: a Poisson model for disease counts (conditioning on the rates) at the first stage, and a specification of a function of the rates using spatial random effects at the second stage. These random effects are specified as a prior and introduce spatial smoothing to the rate (or risk) estimates. What we see in the literature is the amount of smoothing induced under a given prior across areal units compared with the observed/empirical risks. Our contribution here extends previous research on smoothing in univariate areal data models. Specifically, for three different choices of multivariate prior, we investigate both within-prior smoothing according to hyperparameters and across-prior smoothing. The benefit to the user is to illuminate the expected nature of departure from perfect fit associated with these priors, since model performance is not a question of goodness of fit. We propose both theoretical and empirical metrics for our investigation and illustrate with both simulated and real data.
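
The two-stage hierarchy described in the abstract can be illustrated with a minimal univariate sketch. It is not the authors' implementation: the four-area map, the populations, the baseline rate, the precision `tau`, and the helper name `icar_precision` are all hypothetical choices made for illustration, and only a single disease is shown rather than the multivariate M-model. The sketch builds an iCAR precision matrix from an adjacency matrix, reads off the conditional variance of each area's random effect given its neighbours, and draws Poisson counts conditional on the resulting rates.

```python
import numpy as np

def icar_precision(W, tau=1.0):
    """Intrinsic CAR precision matrix tau * (D - W) for a symmetric 0/1 adjacency W."""
    W = np.asarray(W, dtype=float)
    D = np.diag(W.sum(axis=1))  # D holds the number of neighbours of each area
    return tau * (D - W)

# Hypothetical toy map (not from the paper): four areas in a row, 1-2-3-4.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
Q = icar_precision(W, tau=2.0)

# Second stage: under iCAR, theta_i given the rest is normal with mean equal to
# the average of its neighbours and variance 1 / (tau * n_i).
cond_var = 1.0 / np.diag(Q)

# First stage: counts are Poisson conditional on the rates.
rng = np.random.default_rng(0)
theta = np.array([-0.2, 0.0, 0.1, 0.1])          # illustrative spatial effects (sum to zero)
alpha = np.log(1e-4)                              # assumed baseline log-rate
pop = np.array([50_000, 80_000, 120_000, 60_000]) # assumed populations at risk
rates = np.exp(alpha + theta)
counts = rng.poisson(pop * rates)

print("conditional variances:", cond_var)
print("simulated counts:", counts)
```

Areas with more neighbours have a smaller conditional variance, so their random effects are pulled more strongly towards the neighbourhood mean. This is the kind of prior-induced smoothing that the paper sets out to quantify.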

Paper Structure (14 sections, 8 equations, 6 figures, 12 tables)

Figures (6)

  • Figure 1: Total RMSS values, representing the combined smoothing of both diseases, for $\Sigma_{11}=0.04$ and varying $\Sigma_{22}$ across the four scenarios and different correlation levels.
  • Figure 2: Area-specific components of the RMSS for each geographical unit $i$ for colon, pancreas and stomach cancer in continental Spain, shown for the 47 real provinces ($G = 47$, top panels) and for the disaggregation with $G=300$ (bottom panels). The legend reports the 50th, 75th, 85th, 90th, 95th, and 97th percentiles.
  • Figure 3: Total RMSS values, representing the combined smoothing of both diseases, for $\lambda=0.2$, $\Sigma_{11}=0.04$ and varying $\Sigma_{22}$ across the four scenarios and different $\rho$ values.
  • Figure 4: Total RMSS values, representing the combined smoothing of both diseases, for $\lambda=0.2$, $\Sigma_{11}=0.04$ and varying $\Sigma_{22}$ across the four scenarios and different $\rho$ values.
  • Figure 5: Crude mortality rates per 100,000 inhabitants for colon, pancreatic and stomach cancer in continental Spain, shown for the 47 real provinces ($G = 47$, top panels) and for the disaggregation with $G=300$ (bottom panels).
  • ...and 1 more figure