Large Wave Direction Data Modeling Using Wrapped Spatial Gaussian Markov Random Fields

Arnab Hazra

Abstract

Statistical modeling of dependent directional data remains relatively underexplored, particularly in high-dimensional spatial settings. Existing approaches for spatial angular data primarily rely on wrapped Gaussian process (WGP) models, which provide a coherent framework for capturing spatial dependence on the circle. However, WGP-based methods become computationally challenging when the spatial domain is large and observations are available at high resolution. This limitation is especially relevant in the analysis of large-scale geological and climate phenomena, such as tsunamis and hurricanes, where directional measurements (e.g., wave or wind directions) may be available over an entire ocean basin. To address these challenges, we propose a wrapped Gaussian Markov random field (WGMRF) model for large spatial directional datasets. By exploiting the sparse precision structure inherent in Gaussian Markov random fields, the proposed approach achieves substantial computational gains while preserving flexible spatial dependence on the circular scale. We discuss key properties of the model, including its identifiability and dependence characteristics. Model fitting uses standard Markov chain Monte Carlo techniques. Through extensive simulation studies and an application to wave direction data across the Indian Ocean during the 2004 Indian Ocean Tsunami, we compare the proposed method with both a non-spatial wrapped Gaussian model and a low-rank WGP alternative. The results demonstrate that the WGMRF offers improved predictive performance and scalability in large-domain applications.
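
To make the idea of wrapping a sparse-precision Gaussian field concrete, here is a minimal sketch; it is not the paper's implementation. It assumes a toy one-dimensional chain graph in place of the SPDE mesh, with a hypothetical precision matrix $Q = \tau(\kappa^2 I + L)$, where $L$ is the path-graph Laplacian. The function name and the parameters `kappa2`, `tau`, and `mu` are illustrative assumptions. Because $Q$ is tridiagonal, its Cholesky factor is bidiagonal, so a draw costs $O(n)$ operations.

```python
import math
import random


def sample_wrapped_chain_gmrf(n, mu=0.0, kappa2=0.1, tau=1.0, seed=0):
    """Draw one wrapped sample from a toy chain-graph GMRF with n >= 2 nodes.

    Assumed (illustrative) precision: Q = tau * (kappa2 * I + Laplacian),
    which is tridiagonal and positive definite for kappa2 > 0.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    rng = random.Random(seed)

    # Diagonal of Q: end nodes have one neighbour, interior nodes have two.
    diag = [tau * (kappa2 + (1.0 if i in (0, n - 1) else 2.0)) for i in range(n)]
    off = -tau  # Q[i, i+1] = Q[i+1, i]

    # Cholesky factorisation Q = L L^T with L lower bidiagonal.
    l_diag = [0.0] * n
    l_sub = [0.0] * (n - 1)
    l_diag[0] = math.sqrt(diag[0])
    for i in range(1, n):
        l_sub[i - 1] = off / l_diag[i - 1]
        l_diag[i] = math.sqrt(diag[i] - l_sub[i - 1] ** 2)

    # Solve L^T x = z by back substitution, so that Cov(x) = Q^{-1}.
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [0.0] * n
    x[n - 1] = z[n - 1] / l_diag[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = (z[i] - l_sub[i] * x[i + 1]) / l_diag[i]

    # Wrap the linear field onto the circle [0, 2*pi).
    return [(mu + xi) % (2.0 * math.pi) for xi in x]


angles = sample_wrapped_chain_gmrf(50, mu=math.pi, seed=1)
```

On a two-dimensional mesh the precision matrix is still sparse but no longer tridiagonal, so a sparse Cholesky routine would replace the hand-written recursion used here.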

Paper Structure (28 sections, 26 equations, 7 figures, 2 tables)


Figures (7)

  • Figure 1: Gridded wave direction data obtained from the ERA5 global wave reanalysis, corresponding to the hour of the 2004 Indian Ocean Tsunami (01:00 hour UTC), covering the entire Indian Ocean basin.
  • Figure 2: Left: Circular histogram of the wave direction dataset. Right: The empirical semivariograms of sine and cosine transformations of the wave direction dataset. Here, the $X$-axis denotes the geodesic distances.
  • Figure 3: Triangulated mesh over the entire Indian Ocean basin, which we use to construct the spatial process $Z(\cdot)$ in \ref{eq:spde} using SPDEs. Given the spherical nature of the globe, we use crs = inla.CRS("+proj=longlat +datum=WGS84"). Here, the maximum triangle edge lengths near the data and in the outer extension are set to $1^\circ$ and $3^\circ$, respectively. The extension before the outer layer and for the coarsest triangles are set to $1^\circ$ and $10^\circ$, respectively. For choosing the boundary, we use a non-convex hull with the shrinkage amount $0.005^\circ$. Overall, the mesh includes 8145 nodes.
  • Figure 4: Top panel: Histograms of the pointwise sine-cosine squared differences $(\cos[Y(\bm{s}_i)] - \cos[\widehat{Y}(\bm{s}_i)])^2 + (\sin[Y(\bm{s}_i)] - \sin[\widehat{Y}(\bm{s}_i)])^2$ based on the non-spatial wrapped normal distribution (IID WN), the low-rank WGP in Section \ref{subsec:lowrank_wgp}, and the final proposed model (WGMRF) in \ref{eq:final_model}. Bottom panel: Histograms of the pointwise posterior predictive concentration based on the same three models.
  • Figure 5: Top panel: Under a 10-fold cross-validation, histograms of the pointwise sine-cosine squared differences $(\cos[Y(\bm{s}_i)] - \cos[\widehat{Y}(\bm{s}_i)])^2 + (\sin[Y(\bm{s}_i)] - \sin[\widehat{Y}(\bm{s}_i)])^2$ based on the non-spatial wrapped normal distribution (IID WN), the low-rank WGP in Section \ref{subsec:lowrank_wgp}, and the final proposed model (WGMRF) in \ref{eq:final_model}. Bottom panel: Histograms of the pointwise posterior predictive concentration under the same 10-fold cross-validation based on the same three models.
  • ...and 2 more figures
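
The pointwise sine-cosine squared difference used in the captions of Figures 4 and 5 can be sketched as follows. The function names are hypothetical and the input angles are made-up values for illustration, not data from the paper. The second function uses the trigonometric identity $(\cos a - \cos b)^2 + (\sin a - \sin b)^2 = 2 - 2\cos(a - b)$, which shows that the measure depends only on the angular separation and ranges from 0 (same direction) to 4 (opposite directions).

```python
import math


def sincos_sq_diff(y, y_hat):
    """Squared chord distance between two angles (radians) on the unit circle."""
    return (math.cos(y) - math.cos(y_hat)) ** 2 + (math.sin(y) - math.sin(y_hat)) ** 2


def sincos_sq_diff_via_identity(y, y_hat):
    """Same quantity, written as 2 - 2*cos(y - y_hat)."""
    return 2.0 - 2.0 * math.cos(y - y_hat)


# Illustrative angles on either side of the 0 / 2*pi cut: they are close on
# the circle even though their raw numerical difference is almost 2*pi.
y_obs = 0.1
y_pred = 2.0 * math.pi - 0.1

circular_error = sincos_sq_diff(y_obs, y_pred)
naive_error = (y_obs - y_pred) ** 2  # ignores the circular geometry
```

Here `circular_error` is small because the two directions are only 0.2 radians apart, whereas `naive_error` is large, which is why errors for directional data are measured on the sine-cosine scale.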