Comparison of generalised additive models and neural networks in applications: A systematic review

Jessica Doohan, Lucas Kook, Kevin Burke

TL;DR

The paper conducts a PRISMA-guided systematic review comparing generalised additive models (GAMs) and neural networks on real-world tabular data, synthesising 143 papers covering 430 datasets. Using mixed-effects modelling, it finds no consistent superiority for either approach across RMSE, $R^2$, and AUC, though neural networks tend to do better on larger datasets and on those with more predictors, an advantage that has narrowed over time. GAMs remain competitive, especially on smaller datasets, and retain interpretability, which suggests the two methods are complementary rather than competing. The study also finds substantial reporting gaps on dataset characteristics and neural network complexity, underscoring the need for standardised reporting and suggesting that hybrid or ensemble approaches could combine the strengths of both paradigms for practical tabular-data applications.

Abstract

Neural networks have become a popular tool in predictive modelling, more commonly associated with machine learning and artificial intelligence than with statistics. Generalised Additive Models (GAMs) are flexible non-linear statistical models that retain interpretability. Both are state-of-the-art in their own right, with their respective advantages and disadvantages. This paper analyses how these two model classes have performed on real-world tabular data. Following PRISMA guidelines, we conducted a systematic review of papers that performed empirical comparisons of GAMs and neural networks. Eligible papers were identified, yielding 143 papers, with 430 datasets. Key attributes at both paper and dataset levels were extracted and reported. Beyond summarising comparisons, we analyse reported performance metrics using mixed-effects modelling to investigate potential characteristics that can explain and quantify observed differences, including application area, study year, sample size, number of predictors, and neural network complexity. Across datasets, no consistent evidence of superiority was found for either GAMs or neural networks when considering the most frequently reported metrics (RMSE, $R^2$, and AUC). Neural networks tended to outperform in larger datasets and in those with more predictors, but this advantage narrowed over time. Conversely, GAMs remained competitive, particularly in smaller data settings, while retaining interpretability. Reporting of dataset characteristics and neural network complexity was incomplete in much of the literature, limiting transparency and reproducibility. This review highlights that GAMs and neural networks should be viewed as complementary approaches rather than competitors. For many tabular applications, the performance trade-off is modest, and interpretability may favour GAMs.

Paper Structure

This paper contains 24 sections, 9 equations, 8 figures, 10 tables.

Figures (8)

  • Figure 1: Feedforward multilayer perceptron neural network.
  • Figure 2: Flow diagram summarising the results of the paper identification, screening, and inclusion.
  • Figure 3: Number of papers plotted by year of publication, coloured by domain area.
  • Figure 4: Proportion of datasets by key characteristics. Reg = regression; Class = classification. Category labels (Small, Medium, Large, Unknown) refer to relative groupings based on sample size, number of predictors, and model complexity.
  • Figure 5: Log performance ratios plotted by year, where greater than zero signifies neural networks are outperforming GAMs and below zero indicates GAM superiority, as shown by arrows.
  • ...and 3 more figures
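
The sign convention in the Figure 5 caption (above zero favours neural networks, below zero favours GAMs) can be illustrated with a minimal sketch. The exact ratio used in the paper is not given in this summary, so the definition below is an assumption: for an error metric such as RMSE the ratio is taken as GAM over neural network, and for score metrics such as $R^2$ and AUC it is taken as neural network over GAM, so that a positive log ratio always means the neural network did better. The function name, the metric labels, and the numbers in the usage lines are hypothetical.

```python
import math

# Assumption: RMSE is an error (lower is better); R2 and AUC are scores (higher is better).
LOWER_IS_BETTER = {"RMSE"}
HIGHER_IS_BETTER = {"R2", "AUC"}


def log_performance_ratio(metric: str, gam_value: float, nn_value: float) -> float:
    """Return a log ratio that is positive when the neural network does better.

    This is an illustrative definition, not necessarily the one used in the paper.
    """
    if gam_value <= 0 or nn_value <= 0:
        raise ValueError("both metric values must be positive to take a log ratio")
    if metric in LOWER_IS_BETTER:
        # Smaller error is better, so GAM error / NN error > 1 when the NN wins.
        return math.log(gam_value / nn_value)
    if metric in HIGHER_IS_BETTER:
        # Larger score is better, so NN score / GAM score > 1 when the NN wins.
        return math.log(nn_value / gam_value)
    raise ValueError(f"unknown metric: {metric}")


# Made-up values, for illustration only.
rmse_ratio = log_performance_ratio("RMSE", gam_value=2.0, nn_value=1.0)  # NN has lower error
auc_ratio = log_performance_ratio("AUC", gam_value=0.8, nn_value=0.8)    # tie
r2_ratio = log_performance_ratio("R2", gam_value=0.9, nn_value=0.6)      # GAM has higher R2
```

Putting all three metrics on one signed log scale is what would let results reported with different metrics be plotted and modelled together; a tie maps to zero regardless of metric.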