A spatial scan statistic for categorical, functional data

Camille Frévent, Moustapha Sarr, Sophie Dabo-Niang

Abstract

We have developed and tested a spatial scan statistic for categorical, functional data (CFSS), a data structure in which existing approaches cannot identify spatial clusters. Our methodology combines an encoding scheme for categorical, functional observations with a nonparametric scan statistic. In a simulation study with three distinct scenarios, the CFSS accurately recovered the simulated spatial clusters, with very low false positive rates, high true positive rates, and high positive predictive values. We have also used the CFSS to identify and characterize spatial clusters in French air pollution data from the winter of 2024.

Paper Structure

This paper contains 15 sections, 5 theorems, 36 equations, 7 figures, and 1 table.

Key Result

Proposition 1

$E_t$ is a self-adjoint, idempotent operator.
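
A self-adjoint, idempotent operator is an orthogonal projection. The excerpt does not define $E_t$, so the sketch below assumes that it is the conditional expectation given the state occupied at time $t$, and checks both properties numerically on a small finite probability space. The function name, the labels, and the probabilities are hypothetical and serve only as an illustration; they are not taken from the paper.

```python
import numpy as np

def conditional_expectation_matrix(labels, probs):
    """Matrix of the map f -> E[f | label] on a finite probability space.

    Assumption: this stands in for E_t, with `labels` playing the role of
    the state occupied at time t by each elementary outcome.
    """
    labels = np.asarray(labels)
    probs = np.asarray(probs, dtype=float)
    n = labels.size
    E = np.zeros((n, n))
    for state in np.unique(labels):
        idx = np.flatnonzero(labels == state)
        # Within a state, E[f | state] is the probability-weighted mean of f,
        # so every row of this block holds the normalised weights.
        E[np.ix_(idx, idx)] = probs[idx] / probs[idx].sum()
    return E

rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=12)  # hypothetical state of each outcome
probs = rng.dirichlet(np.ones(12))    # hypothetical outcome probabilities

E = conditional_expectation_matrix(labels, probs)
W = np.diag(probs)  # inner product <f, g> = sum_i p_i f_i g_i

# Idempotent: conditioning twice is the same as conditioning once.
is_idempotent = np.allclose(E @ E, E)
# Self-adjoint for the weighted inner product: W E must be symmetric.
is_self_adjoint = np.allclose(W @ E, (W @ E).T)
```

Self-adjointness is checked against the probability-weighted inner product rather than the plain Euclidean one, because the conditional expectation is an orthogonal projection in $L^2$ of the underlying probability measure.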

Figures (7)

  • Figure 1: An illustration of three types of clusters in categorical, functional data. Inside a type (i) cluster, the individuals' sequence of states differs from that outside the cluster; here, the first state differs, and the individuals move directly from state 1 to state 3 without passing through state 2. In the type (ii) cluster, individuals change state more (or less) rapidly than individuals outside the cluster. In the type (iii) cluster, the individuals spend more (or less) time in certain states than individuals outside the cluster.
  • Figure 2: Simulation of the CFSS's ability to detect a spatial cluster in categorical, functional data. For each scenario, the power curves, the true positive rates, the false positive rates, and the positive predictive values for detection of the spatial cluster as the MLC are shown. $\alpha$ and $\gamma$ are the parameters that control the cluster intensity.
  • Figure 3: Map of the monitoring stations and daily mean $\text{PM}_{10}$ concentration curves between January $1^\text{st}$ and March $31^\text{st}$, 2024 (left panel) and the associated categorical trajectories (right panel).
  • Figure 4: The MLC (in red) detected with the CFSS for the daily mean $\text{PM}_{10}$ concentrations (categorized using the Atmo index).
  • Figure 5: Trajectories inside and outside the MLC detected with the CFSS for the daily mean $\text{PM}_{10}$ concentrations (categorized using the Atmo index).
  • ...and 2 more figures

Theorems & Definitions (10)

  • Proposition 1
  • Proposition 2: saporta1981methodes p107
  • Proposition 3: preda2021categorical
  • Proposition 4: saporta1981methodes
  • Proposition 5: saporta1981methodes
  • proof
  • proof
  • proof
  • proof
  • proof