Integrating Complex Covariate Transformations in Generalized Additive Models

Claudia Collarin, Matteo Fasiolo, Yannig Goude, Simon N. Wood

Abstract

Transformations of covariates are widely used in applied statistics to improve interpretability and to satisfy assumptions required for valid inference. More broadly, feature engineering encompasses a wider set of practices aimed at enhancing predictive performance, and is typically performed as part of a data pre-processing step. In contrast, this paper integrates a substantial component of the feature engineering process directly into the modelling stage. This is achieved by introducing a novel general framework for embedding interpretable covariate transformations within multi-parameter Generalised Additive Models (GAMs). Our framework accommodates any sufficiently differentiable scalar-valued transformation of potentially high-dimensional and complex covariates. These transformations are treated as integral model components, with their parameters estimated jointly with regression coefficients via maximum a posteriori (MAP) methods, and joint uncertainty quantified via approximate Bayesian techniques. Smoothing parameters are selected in an empirical Bayes framework using a Laplace approximation to the marginal likelihood, supported by efficient computation based on implicit differentiation methods. We demonstrate the flexibility and practical value of the proposed methodology through applications to forecasting electricity net-demand in Great Britain and to modelling house prices in London. Methods for building and fitting GAMs with nested transformations are provided by the gamFactory R package, available at https://github.com/mfasiolo/gamFactory, while the code for reproducing the results in this paper is available at https://doi.org/10.5281/zenodo.19239350.

Paper Structure

This paper contains 33 sections, 64 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Examples of smooth effects with nested covariate transformations. Exponential smooths of forecast temperature in GB (a) and the corresponding smooth effects on electricity net-demand (b) for two distinct exponential smoothing parameters, estimated by model \ref{eq:netdemand}. Local house prices in London (d) and central London (e), obtained by kernel smoothing. The radius of the blue circle ($\approx 520$ meters) is twice the bandwidth of the kernel estimated by model \ref{eq:mod_house}. The estimated multiplicative effect of kernel-smoothed prices on local expected house prices is shown in plot (c). The reference dashed line has unit slope and the five ticks at the top are price quantiles.
  • Figure 2: Scaling issues in linear combinations. Top row: Scatterplot of data to be combined via two unit vectors (a) and densities of the transformed data (b). Bottom row: Same as the top row, but here the norm of the vectors is adjusted so that the scale of the transformed data is invariant w.r.t. the direction of the combination vector.
  • Figure 3: Inner single index coefficients and outer smooth effect of wind speed (a-b) and of net-demand lags (c-d).
  • Figure 4: The effect of neighbouring prices, $s^{10}\left[\tilde{s}^{\text{mgks}}(\mathbf{x}_{i})\right]$ (a), the spatial effect, $f_3^{400}(\text{lon}_i, \text{lat}_i)$ (b) and their joint effect on expected prices (c), estimated under model (\ref{eq:mod_house}). The estimated spatial effect, $f_3^{2000}(\text{lon}_i, \text{lat}_i)$ (d), under a standard model that does not include the spatial autoregressive term. The effects have been exponentiated to obtain multiplicative effects on the original, rather than logarithmic, price scale.
  • Figure 5: Multiplicative effects of IMD (a) and distance to the nearest tube station (b), estimated by model \ref{eq:mod_house} and by a model using $k = 2000$ basis functions for the standard spatial effect, but no autoregressive effect. The $x$-axis in (b) is on a square root scale.
  • ...and 2 more figures
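
The captions of Figures 1 and 2 mention two kinds of inner transformation: an exponential smooth governed by a smoothing parameter, and a linear combination whose vector norm is adjusted so that the scale of the transformed data does not depend on direction. The following is a minimal Python sketch of both ideas, not the gamFactory implementation. The function names, the recursion used for the exponential smooth, and the choice of rescaling by the standard deviation of the projection are all assumptions made for illustration; the paper may parametrise these transformations differently.

```python
import statistics


def exp_smooth(x, omega):
    """Exponential smooth of a sequence (assumed recursion, for illustration).

    s[0] = x[0] and s[t] = omega * s[t-1] + (1 - omega) * x[t],
    so larger omega gives a smoother, slower-moving series.
    """
    if not 0.0 <= omega < 1.0:
        raise ValueError("omega must lie in [0, 1)")
    smoothed = [x[0]]
    for value in x[1:]:
        smoothed.append(omega * smoothed[-1] + (1.0 - omega) * value)
    return smoothed


def project(rows, a):
    """Linear combination of each data row with the vector a."""
    return [sum(r_j * a_j for r_j, a_j in zip(row, a)) for row in rows]


def rescale_direction(rows, a):
    """Rescale a so that the projected data has unit standard deviation.

    This is one hypothetical way to make the scale of the transformed
    data invariant to the direction of a.
    """
    sd = statistics.pstdev(project(rows, a))
    if sd == 0.0:
        raise ValueError("projection is constant along this direction")
    return [a_j / sd for a_j in a]
```

With data whose columns have very different spreads, projecting onto two different unit vectors gives outputs on very different scales, whereas `project(rows, rescale_direction(rows, a))` has unit standard deviation whichever direction `a` points in. Similarly, calling `exp_smooth` on the same series with two values of `omega` gives two smooths of differing responsiveness, as in panel (a) of Figure 1.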