Modeling group heterogeneity in spatio-temporal data via physics-informed semiparametric regression

Marco F. De Sanctis, Eleonora Arnone, Francesca Ieva, Laura M. Sangalli

TL;DR

This work addresses spatio-temporal data with grouping structures by introducing a physics-informed semiparametric mixed effects model that combines fixed covariates $X\beta$, a shared nonparametric field $f$ regularized through a space–time PDE operator $\boldsymbol{L}$, and group-specific random effects with covariance $\Sigma_b$. Estimation proceeds via a two-step FPIRLS algorithm, with an EM step to update the random-effects covariance, and discretization via finite elements in space and cubic B-splines in time. The approach is validated through simulations showing improved recovery of the nonparametric field and competitive estimation of fixed and random effects, and is demonstrated on Lombardy NO$_2$ data where sensor heterogeneity and missing observations are handled by the model. The work provides asymptotic guarantees, a scalable estimation framework, and practical insights for incorporating physical dynamics and grouping structure into spatio-temporal analyses, with extensions to anisotropy and irregular domains.
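
Read together, the components named above suggest a model of roughly the following form. This is only a sketch under stated assumptions: the indexing by group $i$ and observation $j$, the random-effects design vector $z_{ij}$, the Gaussian noise term $\varepsilon_{ij}$, the smoothing weight $\lambda$, and the exact shape of the penalty are illustrative choices and are not notation taken from the paper.

$$
y_{ij} = x_{ij}^{\top}\beta + f(p_{ij}, t_{ij}) + z_{ij}^{\top} b_i + \varepsilon_{ij},
\qquad b_i \sim \mathcal{N}(0, \Sigma_b),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2),
$$

$$
\text{with } f \text{ penalized through a term such as } \lambda \int_{T}\!\int_{\Omega} \big(\boldsymbol{L} f\big)^2 \, dp \, dt .
$$

Here $p_{ij}$ and $t_{ij}$ stand for the spatial location and time of an observation, $\Omega$ for the spatial domain, and $T$ for the time interval. The field $f$ is shared across groups, while $b_i$ carries the group-specific deviation.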

Abstract

In this work we propose a novel approach for modeling spatio-temporal data characterized by group structures. In particular, we extend classical mixed effect regression models by introducing a space-time nonparametric component, regularized through a partial differential equation, to embed the physical dynamics of the underlying process, while random effects capture latent variability associated with the group structure present in the data. We propose a two-step procedure to estimate the fixed and random components of the model, relying on a functional version of the Iterative Reweighted Least Squares algorithm. We investigate the asymptotic properties of both fixed and random components, and we assess the performance of the proposed model through a simulation study, comparing it with state-of-the-art alternatives from the literature. The proposed methodology is finally applied to the study of hourly nitrogen dioxide concentration data in Lombardy (Italy), using random effects to account for measurement heterogeneity across monitoring stations equipped with different sensor technologies.
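
The two-step idea in the abstract (update the fixed part, then update the random effects and their variance) can be illustrated on a much simpler problem. The sketch below is not the paper's algorithm: it drops the nonparametric field $f$ and the PDE penalty entirely, assumes a scalar random intercept per group, and uses a plain EM-style alternation. The function name `fit_random_intercept_em` and all simulation settings are hypothetical.

```python
import numpy as np


def fit_random_intercept_em(y, X, groups, n_iter=200):
    """Toy EM-style fit of y = X @ beta + b[groups] + noise.

    Assumes integer group labels 0..G-1 and b_g ~ N(0, sigma_b2).
    Illustration only: no nonparametric field, no PDE regularization.
    """
    n_groups = groups.max() + 1
    counts = np.bincount(groups, minlength=n_groups)
    b = np.zeros(n_groups)
    sigma2, sigma_b2 = 1.0, 1.0
    for _ in range(n_iter):
        # Step 1: fixed effects by least squares, random effects held fixed.
        beta, *_ = np.linalg.lstsq(X, y - b[groups], rcond=None)
        # Step 2a: posterior mean and variance of each random intercept.
        resid = y - X @ beta
        sums = np.bincount(groups, weights=resid, minlength=n_groups)
        post_var = sigma_b2 * sigma2 / (counts * sigma_b2 + sigma2)
        b = post_var * sums / sigma2
        # Step 2b: update the variance components from expected squares.
        sigma_b2 = np.mean(b**2 + post_var)
        sigma2 = (np.sum((resid - b[groups]) ** 2)
                  + np.sum(counts * post_var)) / y.size
    return beta, b, sigma_b2, sigma2


# Simulated data (all values below are made up for the illustration).
rng = np.random.default_rng(0)
n_groups, n_per_group = 30, 40
groups = np.repeat(np.arange(n_groups), n_per_group)
X = rng.normal(size=(groups.size, 2))
beta_true = np.array([1.0, -2.0])
b_true = rng.normal(scale=1.0, size=n_groups)
y = X @ beta_true + b_true[groups] + rng.normal(scale=0.5, size=groups.size)

beta_hat, b_hat, sigma_b2_hat, sigma2_hat = fit_random_intercept_em(y, X, groups)
print("beta:", beta_hat)
print("random-effect sd:", np.sqrt(sigma_b2_hat))
print("noise sd:", np.sqrt(sigma2_hat))
```

In the paper's setting the first step would also estimate the penalized field $f$, and the variance update would act on a full covariance $\Sigma_b$ rather than a single scalar; the alternation between the two blocks of unknowns is the only feature this toy shares with it.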

Paper Structure

This paper contains 11 sections, 4 theorems, 29 equations, 8 figures.

Key Result

Proposition 3.1

For a given pair of fixed effects $(\hat{\boldsymbol{\beta}}, \hat{f})$, the maximizer of the conditional likelihood of model (eq:model_e) is given by:

Figures (8)

  • Figure 1: Top panel: spatial distribution of square root NO$_2$ concentrations in Lombardy, at three representative hours of the considered day (08:00, 16:00, and 21:00). Bottom panel: hourly temporal profile of the square root NO$_2$ concentrations across ARPA monitoring stations on $15$ January $2019$ (left); spatial distribution of sensor technology types across the region (right).
  • Figure 2: Wind vector field in Lombardy on $15$ January $2019$. Intensity and direction data are provided by $119$ monitoring stations.
  • Figure 3: Estimated nonparametric maps at some of the considered time instants. The first row presents the true field, the second row shows the data for a fixed replica, while the subsequent rows display the estimates (averaged over the $30$ replicas) for each of the competing methods: the proposed Mixed Effect Spatio-Temporal Regression with Partial Differential Equation regularization (MEST-PDE); its isotropic counterpart (MEST-ISO); thin-plate-spline based on nlme (TPS) and on lme4 (TPS4); soap film smoothing based on nlme (SOAP) and on lme4 (SOAP4).
  • Figure 4: Accuracy comparison of fixed effects estimates provided by the competing methods: the proposed Mixed Effect Spatio-Temporal Regression with Partial Differential Equation regularization (MEST-PDE); its isotropic counterpart (MEST-ISO); thin-plate-spline based on nlme (TPS) and on lme4 (TPS4); soap film smoothing based on nlme (SOAP) and on lme4 (SOAP4). Left panel: RMSE of the nonparametric field $f$. Central panel: estimates of $\beta_1$. Right panel: estimates of $\beta_2$.
  • Figure 5: Estimated variance components provided by the competing methods: the proposed Mixed Effect Spatio-Temporal Regression with Partial Differential Equation regularization (MEST-PDE); its isotropic counterpart (MEST-ISO); thin-plate-spline based on nlme (TPS); soap film smoothing based on nlme (SOAP); thin-plate-spline based on lme4 (TPS4); soap film smoothing based on lme4 (SOAP4). Left: estimated standard deviation of the random effects. Right: estimated relative precision factor.
  • ...and 3 more figures

Theorems & Definitions (4)

  • Proposition 3.1
  • Proposition 4.1
  • Proposition 4.2
  • Proposition 4.3