A Multi-Fidelity Tensor Emulator for Spatiotemporal Outputs: Emulation of Arctic Sea Ice Dynamics

Tristan Contant, Yawen Guan, Ander Wilson, Adrian K. Turner, Deborah Sulsky

TL;DR

A multi-fidelity (MF) emulator is developed that combines tensor decomposition for dimensionality reduction, Gaussian process priors for flexible function approximation, and an additive discrepancy model that captures systematic differences between low-fidelity (LF) and high-fidelity (HF) data. It consistently achieves lower prediction error and reduced uncertainty than LF-only and HF-only models.

Abstract

Numerical models are widely used to simulate the earth system, but they are computationally expensive and often depend on many uncertain input parameters. Their effective use requires calibration and uncertainty quantification, which typically involve running the model across many input configurations and therefore incur substantial computational cost. Statistical emulation provides a practical alternative for efficiently exploring model behavior. We are motivated by the Arctic sea ice component of the Energy Exascale Earth System Model (MPAS-Seaice), which generates large spatiotemporal outputs at multiple spatial resolutions, with high-resolution (or high-fidelity, HF) simulations being more accurate but computationally more expensive than lower-resolution (low-fidelity, LF) simulations. Multi-fidelity (MF) emulation integrates information across resolutions to construct efficient and accurate surrogate models, yet existing approaches struggle to scale to large spatiotemporal data. We develop an MF emulator that combines tensor decomposition for dimensionality reduction, Gaussian process priors for flexible function approximation, and an additive discrepancy model to capture systematic differences between LF and HF data. The proposed framework enables scalable emulation while maintaining accurate predictions and well-calibrated uncertainty for complex spatiotemporal fields, and consistently achieves lower prediction error and reduced uncertainty than LF-only and HF-only models in both simulation studies and MPAS-Seaice analysis. By leveraging the complementary strengths of LF and HF data and using an efficient tensor decomposition approach, our emulator greatly reduces computational expense, making it well suited for large-scale simulation tasks involving complex physical models.
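
The additive discrepancy idea in the abstract can be illustrated with a minimal scalar-output sketch. This is not the authors' tensor emulator: the toy functions `low_fidelity` and `high_fidelity`, the design sizes, the RBF kernel, and the fixed `length_scale` and `jitter` values are all assumptions chosen for illustration. The sketch fits one Gaussian process (posterior mean only) to many cheap LF runs, fits a second one to the HF-minus-LF residuals at the few HF inputs, and adds the two predictions.

```python
import numpy as np

def rbf_kernel(a, b, length_scale):
    """Squared-exponential kernel between two 1-D input arrays."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_mean(x_train, y_train, x_test, length_scale=0.3, jitter=1e-6):
    """Posterior mean of a zero-mean GP with fixed (assumed) hyperparameters."""
    k_train = rbf_kernel(x_train, x_train, length_scale)
    k_train = k_train + jitter * np.eye(len(x_train))  # numerical stabiliser
    k_cross = rbf_kernel(x_test, x_train, length_scale)
    return k_cross @ np.linalg.solve(k_train, y_train)

# Hypothetical simulators: the HF output is the LF output plus a smooth offset.
def low_fidelity(x):
    return np.sin(2 * np.pi * x)

def high_fidelity(x):
    return np.sin(2 * np.pi * x) + 0.3 * x

x_lf = np.linspace(0.0, 1.0, 30)    # many cheap LF runs
x_hf = np.linspace(0.0, 1.0, 4)     # few expensive HF runs
x_test = np.linspace(0.0, 1.0, 101)

y_lf = low_fidelity(x_lf)
y_hf = high_fidelity(x_hf)

# Step 1: emulate the LF simulator.
lf_pred = gp_mean(x_lf, y_lf, x_test)

# Step 2: emulate the discrepancy, i.e. HF data minus the LF emulator at HF inputs.
discrepancy = y_hf - gp_mean(x_lf, y_lf, x_hf)
discrepancy_pred = gp_mean(x_hf, discrepancy, x_test)

# Step 3: the MF prediction is the LF emulator plus the discrepancy emulator.
mf_pred = lf_pred + discrepancy_pred

# Baseline: a GP trained on the HF runs alone.
hf_pred = gp_mean(x_hf, y_hf, x_test)

truth = high_fidelity(x_test)
mf_mse = float(np.mean((mf_pred - truth) ** 2))
hf_mse = float(np.mean((hf_pred - truth) ** 2))
lf_mse = float(np.mean((lf_pred - truth) ** 2))
print(f"MF MSE: {mf_mse:.2e}  HF-only MSE: {hf_mse:.2e}  LF-only MSE: {lf_mse:.2e}")
```

In this toy setting the discrepancy is much smoother than the HF response itself, so a handful of HF runs is enough to learn it, while the LF runs carry the fine structure. A full treatment would also estimate the kernel hyperparameters and propagate predictive variance, which this sketch omits.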

Paper Structure (15 sections, 18 equations, 6 figures, 2 tables)

Figures (6)

  • Figure 1: Run time comparison for 60 km (low-fidelity, LF) and 30 km (high-fidelity, HF) MPAS-Seaice simulations. The HF model is slower at low processor counts but benefits more from additional processors, while the LF model shows diminishing speedup. LF simulations are up to 16 times cheaper in computational cost than the HF simulations.
  • Figure 2: Logit-based transformation of $[0,1]$-bounded MPAS-Seaice outputs to the real line. Values are truncated to $[0.01, 0.99]$ and then mapped using a compressed logit function, resulting in a near-linear mapping between 0.1 and 0.99. The left panel shows the full transformation, and the right panel a zoomed-in view. This ensures that the data are suitable for the proposed methodology, which is broadly applicable to real-valued data and enables the use of basis decomposition methods such as Tucker decomposition.
  • Figure 3: Results from the simulation study. Violin plots compare MSE, SD, and 95% credible interval coverage (the gray line indicates the target) for four emulators: tensor emulators trained on low-fidelity (LF) and high-fidelity (HF) data, a multi-fidelity (MF) tensor emulator, and a baseline Gaussian process emulator fit independently at each spatial and temporal location (Naïve-GP). Metrics are averaged over spatial location and time.
  • Figure 4: First three spatial, monthly, and yearly bases from the Tucker decomposition of low-fidelity MPAS-Seaice data. Spatial bases capture typical ice transitions, monthly bases reflect the seasonal cycle, and yearly bases show long-term decline in sea ice extent. Relative contributions are shown as percentages of total variance for each mode, with the leading basis in every mode explaining over 90% of the variance.
  • Figure 5: Monthly and yearly performance of emulators on MPAS-Seaice data. Mean across leave-one-out cross-validation (LOO-CV), with 2.5%–95.5% LOO-CV quantile bands. Seasonal patterns are evident, with increased bias and uncertainty from June through September reflecting greater variability in MPAS-Seaice outputs during those months. The Naïve-GP shows lower MSE and SD in colder months only because, at many spatiotemporal locations, the outputs are constant across all inputs (all ice or no ice); in warmer months, its per-location GPs fail to capture spatiotemporal dependencies, leading to reduced accuracy and inflated uncertainty.
  • ...and 1 more figure
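
The preprocessing in the Figure 2 caption and the Tucker bases in the Figure 4 caption can be sketched together as follows. Several pieces are assumptions: the exact form of the "compressed logit" is not given here, so a plain logit after truncation to $[0.01, 0.99]$ stands in for it; the Tucker decomposition is computed with a truncated higher-order SVD, which is one common choice and not necessarily the authors'; the synthetic space × month × year tensor and the ranks are made up; and each basis's share of squared singular values is used as a proxy for the variance percentages the caption mentions.

```python
import numpy as np

def to_real_line(y, lower=0.01, upper=0.99):
    """Truncate [0,1]-bounded values, then apply a plain logit (stand-in transform)."""
    clipped = np.clip(y, lower, upper)
    return np.log(clipped / (1.0 - clipped))

def to_unit_interval(z):
    """Inverse of the plain logit (the sigmoid)."""
    return 1.0 / (1.0 + np.exp(-z))

def unfold(tensor, mode):
    """Matricize the tensor so that axis `mode` indexes the rows."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply the tensor by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, -1)
    return np.moveaxis(moved @ matrix.T, -1, mode)

def hosvd(tensor, ranks):
    """Truncated higher-order SVD: returns core, factor matrices, and energy shares."""
    factors, shares = [], []
    for mode, rank in enumerate(ranks):
        u, s, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])            # leading basis vectors for this mode
        shares.append(s ** 2 / np.sum(s ** 2))  # share of squared singular values
    core = tensor
    for mode, factor in enumerate(factors):
        core = mode_product(core, factor.T, mode)  # project onto each basis
    return core, factors, shares

def reconstruct(core, factors):
    """Map a Tucker core back to the original tensor space."""
    out = core
    for mode, factor in enumerate(factors):
        out = mode_product(out, factor, mode)
    return out

# Synthetic stand-in for one simulator run: 20 locations x 12 months x 5 years.
rng = np.random.default_rng(0)
space = np.linspace(-2.0, 2.0, 20)[:, None, None]
month = np.cos(2.0 * np.pi * np.arange(12) / 12.0)[None, :, None]
year = np.linspace(0.0, 1.0, 5)[None, None, :]
noise = 0.1 * rng.standard_normal((20, 12, 5))
concentration = to_unit_interval(3.0 * space + 2.0 * month - year + noise)

transformed = to_real_line(concentration)
core, factors, shares = hosvd(transformed, ranks=(3, 3, 3))
approx = to_unit_interval(reconstruct(core, factors))

for name, share in zip(["spatial", "monthly", "yearly"], shares):
    print(name, "leading-basis share:", round(float(share[0]), 3))
```

The emulator then only needs to model the small core (here 3 × 3 × 3 values) as a function of the simulator inputs, which is what makes the approach scale to large spatiotemporal fields.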