CNN-TFT explained by SHAP with multi-head attention weights for time series forecasting

Stefano F. Stefenon, João P. Matos-Carvalho, Valderi R. Q. Leithardt, Kin-Choong Yow

TL;DR

The paper introduces CNN-TFT-SHAP-MHAW, a hybrid architecture that fuses convolutional feature extraction with a temporal fusion transformer backbone, augmented by SHAP-derived explanations integrated with multi-head attention. It demonstrates strong forecasting performance on hydroelectric flow data, with explainability analysis showing that recent lags most influence predictions. Bayesian optimization tunes hyperparameters to enhance accuracy, and the combined SHAP-attention visualization provides a more trustworthy interpretation of model decisions. The approach advances high-fidelity, interpretable multivariate time-series forecasting with potential applicability across energy and other domains requiring transparent predictions.

Abstract

Convolutional neural networks (CNNs) and transformer architectures offer complementary strengths for modeling temporal data: CNNs excel at capturing local patterns and translational invariances, while transformers effectively model long-range dependencies via self-attention. This paper proposes a hybrid architecture that integrates convolutional feature extraction with a temporal fusion transformer (TFT) backbone to enhance multivariate time series forecasting. The CNN module first applies a hierarchy of one-dimensional convolutional layers to distill salient local patterns from raw input sequences, reducing noise and dimensionality. The resulting feature maps are then fed into the TFT, which applies multi-head attention to capture both short- and long-term dependencies and to weigh relevant covariates adaptively. We evaluate the CNN-TFT on a hydroelectric natural flow time series dataset. Experimental results demonstrate that CNN-TFT outperforms well-established deep learning models, with a mean absolute percentage error of up to 2.2%. Model explainability is provided by a proposed method, Shapley additive explanations with multi-head attention weights (SHAP-MHAW). The resulting architecture, named CNN-TFT-SHAP-MHAW, is promising for applications requiring high-fidelity multivariate time series forecasts, and is available for future analysis at https://github.com/SFStefenon/CNN-TFT-SHAP-MHAW .
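
The two-stage idea in the abstract (local feature extraction by convolution, then attention across time steps) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the fixed averaging kernel, the single attention head with identity projections, and the toy series are all assumptions made for illustration. The actual model learns its kernels and uses a full TFT with multiple heads.

```python
import math

def conv1d(series, kernel):
    """'Valid' 1D cross-correlation: slide the kernel over the series."""
    k = len(kernel)
    return [sum(series[i + j] * kernel[j] for j in range(k))
            for i in range(len(series) - k + 1)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(features):
    """Single-head scaled dot-product self-attention on scalar features.

    Queries, keys and values are the features themselves (identity
    projections), so the scaling factor sqrt(d) is sqrt(1) = 1.
    Returns (outputs, weights); weights[i][j] is how much step i
    attends to step j.
    """
    weights = [softmax([q * k for k in features]) for q in features]
    outputs = [sum(w * v for w, v in zip(row, features)) for row in weights]
    return outputs, weights

# Hypothetical toy series (not data from the paper).
flow = [1.0, 2.0, 4.0, 3.0, 5.0, 6.0]

# Stage 1: a fixed averaging kernel stands in for a learned convolutional filter.
smoothed = conv1d(flow, [0.5, 0.5])

# Stage 2: every step of the feature map mixes information from all other steps.
context, attn = self_attention(smoothed)
```

Each row of `attn` sums to one, so each entry of `context` is a weighted average of the convolutional features. Attention weights of this kind are what the SHAP-MHAW analysis combines with SHAP values.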

Paper Structure

This paper contains 14 sections, 14 equations, 7 figures, 2 tables, and 1 algorithm.

Figures (7)

  • Figure 1: Summarized architecture of the proposed CNN-TFT-SHAP-MHAW.
  • Figure 2: Original flow time series data.
  • Figure 3: RMSE of the trial using Bayesian optimization for hypertuning.
  • Figure 4: RMSE gradients of number of heads versus CNN layers.
  • Figure 5: Example of original versus predicted time series by CNN-TFT-SHAP-MHAW.
  • ...and 2 more figures