
Supervised Learning of Functional Outcomes with Predictors at Different Scales: A Functional Gaussian Process Approach

R. Jacob Andros, Rajarshi Guhaniyogi, Devin Francom, Donatella Pasqualini

TL;DR

Let $Y_s(\mathbf{u})$ denote the functional outcome observed over a spatial domain $\mathcal{D}$. This paper addresses learning such outcomes when predictors exist at two scales: fixed-domain functional predictors and realization-varying global predictors. The authors introduce an additive regression model with spatially varying coefficient functions for the functional predictors and a novel functional Gaussian process (fGP) prior to jointly model the nonlinear effects of global predictors across space. The approach yields principled uncertainty quantification and improved predictive performance, demonstrated via extensive simulations and an emulator analysis of the SLOSH hurricane model. Overall, the framework advances functional data analysis for computer experiments by integrating multi-scale predictors with spatially structured, uncertainty-aware inference.

Abstract

The analysis of complex computer simulations, often involving functional data, presents unique statistical challenges. Conventional regression methods, such as function-on-function regression, typically associate functional outcomes with both scalar and functional predictors on a per-realization basis. However, simulation studies often demand a more nuanced approach to disentangle nonlinear relationships between the functional outcome and predictors observed at multiple scales: domain-specific functional predictors that are fixed across simulation runs, and realization-specific global predictors that vary between runs. In this article, we develop a novel supervised learning framework tailored to this setting. We propose an additive nonlinear regression model that flexibly captures the influence of both predictor types. The effects of functional predictors are modeled through spatially-varying coefficients governed by a Gaussian process prior. Crucially, to capture the impact of global predictors on the functional outcome, we introduce a functional Gaussian process (fGP) prior. This new prior jointly models the entire collection of unknown, spatially-indexed nonlinear functions that encode the effects of the global predictors over the entire domain, explicitly accounting for their spatial dependence. This integrated architecture enables simultaneous learning from both predictor types, provides principled strategies to quantify their respective contributions in predicting the functional outcome, and delivers rigorous uncertainty estimates for both model parameters and predictions. The utility and robustness of our approach are demonstrated through multiple synthetic datasets and a real-world application involving outputs from the Sea, Lake, and Overland Surges from Hurricanes (SLOSH) model.
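
To make the two predictor scales concrete, the following is a minimal simulation sketch of an additive model of the kind the abstract describes. It is not the authors' implementation. The names `exp_cov` and `simulate`, all parameter values, and in particular the separable covariance used for the global-predictor effect (a product of a kernel over global predictors and a kernel over locations) are assumptions made for illustration; the abstract does not specify how the fGP prior is constructed.

```python
import numpy as np

def exp_cov(A, B, sigma2=1.0, phi=1.0):
    """Exponential covariance sigma2 * exp(-||a - b|| / phi) between rows of A and B."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / phi)

def simulate(n=30, S=8, p=2, q=2, tau2=0.05, seed=0):
    """Draw S functional outcomes at n locations from a hypothetical additive model."""
    rng = np.random.default_rng(seed)
    U = rng.uniform(size=(n, 2))   # spatial locations u_1, ..., u_n
    X = rng.normal(size=(n, p))    # functional predictors x_j(u): fixed across runs
    Z = rng.uniform(size=(S, q))   # global predictors z_s: one row per simulation run

    jitter = 1e-8                  # small diagonal term for numerical stability
    L_u = np.linalg.cholesky(exp_cov(U, U, phi=0.5) + jitter * np.eye(n))
    L_z = np.linalg.cholesky(exp_cov(Z, Z, phi=1.0) + jitter * np.eye(S))

    # Spatially varying coefficients: each column is one GP draw beta_j(u).
    beta = L_u @ rng.normal(size=(n, p))

    # Global-predictor effect F[s, i] = f(z_s; u_i).
    # ASSUMPTION: separable covariance K_z(z, z') * K_u(u, u'), so that the
    # effect is smooth in z and spatially dependent in u.
    F = L_z @ rng.normal(size=(S, n)) @ L_u.T

    # Additive mean plus independent noise with variance tau2.
    mean = (X * beta).sum(axis=1)[None, :] + F
    Y = mean + rng.normal(scale=np.sqrt(tau2), size=(S, n))
    return U, X, Z, beta, F, Y

U, X, Z, beta, F, Y = simulate()
print(Y.shape)  # one row per simulation run, one column per location
```

In this sketch the functional-predictor term is shared by every run, while the global-predictor term changes from run to run, which is the distinction between the two scales drawn in the abstract.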

Paper Structure

This paper contains 10 sections, 7 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: How the response variable $Y_s(\boldsymbol{u})$ is constructed from the functional and global predictors. We assume that the data are observed at only $n$ distinct spatial locations in the domain, given by $\mathcal{U}=\{\boldsymbol{u}_1,\ldots,\boldsymbol{u}_n\}$.
  • Figure 2: Overview of all model parameters and data used in the proposed model. For illustration, we consider $\nu_{\beta,j}=\nu_k=1/2$, leading to the exponential covariance function for $C_k(\cdot,\cdot;\boldsymbol{\theta}_{k})$ and $C_{\beta,j}(\cdot,\cdot;\boldsymbol{\theta}_{\beta,j})$.
  • Figure 3: True and predicted response surfaces produced by the fGP model for a representative out-of-sample simulation from the $S_{\text{test}}$ test simulations in Scenarios 1–4.
  • Figure 4: Posterior distributions of $\tau^2$ under Scenarios 1–4. The blue dotted line marks the true value of $\tau^2$. All plots show accurate estimation of the error variance in the four simulation scenarios.
  • Figure 5: True and predicted response surfaces produced by the fGP model for a representative out-of-sample simulation from the $S_{\text{test}}$ test simulations in Scenarios 5–8, which illustrate cases of model misspecification.
  • ...and 1 more figure
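
The Figure 2 caption notes that a smoothness of $1/2$ leads to the exponential covariance function. As a brief worked check, assume a common parameterization of the Matérn family in terms of the distance $d$; the symbols $\sigma^2$ (variance) and $\phi$ (range) are introduced here for illustration, and the paper's own parameterization through $\boldsymbol{\theta}_k$ and $\boldsymbol{\theta}_{\beta,j}$ may differ.

```latex
% Matern covariance (one common parameterization; sigma^2 and phi are assumed symbols)
C_\nu(d) = \sigma^2 \, \frac{2^{1-\nu}}{\Gamma(\nu)}
           \left(\frac{d}{\phi}\right)^{\nu} K_\nu\!\left(\frac{d}{\phi}\right)

% Known closed forms at nu = 1/2:
%   \Gamma(1/2) = \sqrt{\pi}, \qquad K_{1/2}(x) = \sqrt{\frac{\pi}{2x}} \, e^{-x}

% Substituting x = d / \phi:
C_{1/2}(d) = \sigma^2 \, \frac{\sqrt{2}}{\sqrt{\pi}} \, x^{1/2} \,
             \sqrt{\frac{\pi}{2x}} \, e^{-x}
           = \sigma^2 \exp\!\left(-\frac{d}{\phi}\right)
```

Under this parameterization the prefactors cancel exactly, leaving the exponential covariance mentioned in the caption.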