Fusing Sparse Observations and Dense Simulations for Spatial Extreme Value Analysis: Application to U.S. Coastal Sea Levels

Brian N. White, Brian Blanton, Rick Luettich, Richard L. Smith

Abstract

Estimating spatial extremes from sparse observational networks produces uncertain return level maps, but dense output from physics-based simulation models is often available as a complementary data source. We develop a two-stage frequentist framework for fusing observations and simulations. In Stage 1, generalized extreme value (GEV) distributions are fitted independently at each site, with a nonstationary location parameter where appropriate to accommodate observed trends. In Stage 2, the parameter estimates from all sources are modeled jointly as a high-dimensional spatial process through a linear model of coregionalization (LMC). Cross-source correlations, estimated from spatially interspersed networks without co-located sites, provide the mechanism for information transfer; an analytic gradient for the resulting likelihood keeps estimation computationally practical. We apply the framework to U.S. coastal sea levels over 1979-2021, fusing 29 NOAA tide gauge records with 100 ADCIRC hydrodynamic simulation sites. Leave-one-out cross-validation shows a 35% reduction in 100-year return level RMSE relative to a gauge-only model. Geographic block cross-validation confirms that fusion benefits persist under spatial extrapolation. The approach is implemented in the R package evfuse.
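
As a reading aid for the 100-year return levels mentioned above, the following minimal sketch evaluates the standard GEV quantile formula for a given parameter triple $(\mu, \sigma, \xi)$. It is not taken from the evfuse package: the function names `gev_return_level` and `gev_cdf` and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def gev_return_level(mu, sigma, xi, period=100.0):
    """T-year return level: the GEV quantile with annual exceedance probability 1/T."""
    y = -math.log(1.0 - 1.0 / period)
    if abs(xi) < 1e-12:
        # Gumbel limit as xi -> 0
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


def gev_cdf(z, mu, sigma, xi):
    """GEV distribution function, used here only to check the quantile."""
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-(z - mu) / sigma))
    t = 1.0 + xi * (z - mu) / sigma
    if t <= 0.0:
        # Outside the support: below the lower bound (xi > 0) or above the upper bound (xi < 0)
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


# Hypothetical parameter values (not estimates from the paper)
mu, sigma, xi = 1.0, 0.2, 0.1
z100 = gev_return_level(mu, sigma, xi, period=100.0)
# By construction, the non-exceedance probability at z100 is 1 - 1/100
print(round(gev_cdf(z100, mu, sigma, xi), 6))
```

For a site with a nonstationary location parameter, the same formula would be applied with the location evaluated at the reference year, which is how a "year-2000" return level can be read.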

Paper Structure (33 sections, 14 equations, 7 figures, 7 tables)

Figures (7)

  • Figure 1: Study area showing the locations of 29 NOAA tidal gauge stations (circles) and 100 ADCIRC simulation sites (triangles) along the U.S. East and Gulf Coasts.
  • Figure 2: Stage 1 pointwise GEV maximum likelihood estimates at 129 sites. NOAA tidal gauges (circles with outlines) and ADCIRC simulation sites (triangles). Left: location parameter (meters). At NOAA sites this is $\hat{\mu}_0$, the year-2000 intercept from the nonstationary fit; at ADCIRC sites this is the stationary MLE $\hat{\mu}$. Center: $\log\sigma$. Right: shape $\xi$, with diverging color scale centered at zero. The location parameter increases from south to north along the Atlantic coast, while the Gulf Coast exhibits elevated positive $\xi$ values indicating heavy tails.
  • Figure 3: Kriged NOAA-scale GEV parameters at the 29 NOAA sites from the joint model. Left: location $\mu_0$ (year-2000 intercept). Center: $\log\sigma$. Right: shape $\xi$, with diverging color scale centered at zero. Compare with the Stage 1 MLEs in Figure 2; the kriged fields are spatially smoother due to borrowing from ADCIRC.
  • Figure 4: Left: Kriged 100-year return levels (meters above MSL) at year-2000 reference conditions along the U.S. East and Gulf Coasts. Large circles show NOAA site-level MLE estimates; small points show joint-model predictions at a 15 km coastal grid. Right: Standard errors of the 100-year return level (meters), via Monte Carlo simulation from the kriging posterior.
  • Figure 5: 100-year return level estimates at year-2000 reference conditions with 95% confidence bands along the coast from New Orleans to Boston. The joint model (red) has narrower uncertainty than NOAA-only (blue) along the Atlantic coast; along the Gulf Coast the bands are comparable. The ADCIRC-only model (green) is systematically lower.
  • ...and 2 more figures