Macroeconomic Predictions using Payments Data and Machine Learning

James T. E. Chapman, Ajit Desai

TL;DR

The paper develops a real-time macroeconomic nowcasting framework that combines granular, timely payments data from Canada's ACSS and LVTS with nonlinear ML methods. It implements a crisis-aware cross-validation scheme to address overfitting and SHAP-based interpretability to address transparency. The results show RMSE reductions of up to $40\%$ over linear benchmarks, with larger gains during the COVID-19 crisis, and reveal that payments signals are most valuable in crisis periods and when used for the current month's nowcast. This approach enhances policy-relevant nowcasting by delivering timely estimates of key indicators (GDP, RTS, WTS) with clear predictor attributions, supporting decision-makers during crises.

Abstract

Predicting the economy's short-term dynamics -- a vital input to economic agents' decision-making process -- often uses lagged indicators in linear models. This is typically sufficient during normal times but could prove inadequate during crisis periods. This paper aims to demonstrate that non-traditional and timely data such as retail and wholesale payments, with the aid of nonlinear machine learning approaches, can provide policymakers with sophisticated models to accurately estimate key macroeconomic indicators in near real-time. Moreover, we provide a set of econometric tools to mitigate overfitting and interpretability challenges in machine learning models to improve their effectiveness for policy use. Our models with payments data, nonlinear methods, and tailored cross-validation approaches help improve macroeconomic nowcasting accuracy up to 40\% -- with higher gains during the COVID-19 period. We observe that the contribution of payments data for economic predictions is small and linear during low and normal growth periods. However, the payments data contribution is large, asymmetrical, and nonlinear during strong negative or positive growth periods.
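
The accuracy gain quoted above is a relative reduction in RMSE against a linear benchmark. The following is a minimal sketch of that arithmetic, not the authors' code; the function names `rmse` and `rmse_reduction_pct` and the toy numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


def rmse_reduction_pct(actual, benchmark_pred, model_pred):
    """Percentage by which the model's RMSE falls below the benchmark's RMSE."""
    base = rmse(actual, benchmark_pred)
    return 100.0 * (base - rmse(actual, model_pred)) / base


# Toy data (made up): the benchmark misses by 1.0 everywhere, the model by 0.6.
actual = [1.0, 2.0, 3.0]
benchmark_pred = [2.0, 3.0, 4.0]
model_pred = [1.6, 2.6, 3.6]
reduction = rmse_reduction_pct(actual, benchmark_pred, model_pred)
```

In this toy case the benchmark RMSE is 1.0 and the model RMSE is 0.6, so the reduction is about 40 percent. A negative value would mean the model does worse than the benchmark.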

Paper Structure (27 sections, 12 equations, 20 figures, 5 tables)

Figures (20)

  • Figure 1: Standardized YOY growth rate comparisons of GDP, RTS, and WTS, with selected payment streams. Gray highlighting--GFC period; blue highlighting--COVID-19 period. Note: AFT credit includes Government direct deposit, encoded paper is the sum of multiple streams settled separately in the ACSS, POS payments include online payments, and corporate payments is the sum of paper remittances, EDI payments, and EDI remittances.
  • Figure 2: (Top) Schematic of the standard expanding window approach for cross-validation in time series. The dataset is divided into a training set with validation subsets and a test set (highlighted in blue). (Bottom) Schematic of the proposed randomized expanding window approach showing typical validation subsets (represented by ${\bullet}$) randomly sampled from the validation superset (highlighted in gray). In both plots, the orange line shows the GDP growth rate.
  • Figure 3: Schematic of expanding window approach for a typical fold in $k$-folds cross-validation and out-of-sample prediction. The available data are divided into training, validation, and test sets. For the given iterations of the expanding window (Iter), ${\bullet}$ represents in-sample training points and ${\bullet}$ represents out-of-sample test points (for the fold). For each iteration in this fold of cross-validation, we use randomly sampled ${\bullet}$ points from the validation superset as the validation subset. Note: the out-of-sample size (the number of ${\bullet}$ points) in each validation subset is kept similar to the test set. For instance, both the validation subset and test set have five out-of-sample points each in this schematic.
  • Figure 4: GDP: SHAP global feature importance measured as mean absolute Shapley values for each instance in the entire training sample (Mar 2005 to Dec 2020). The top 20 features are ranked from high (top) to low (bottom) based on average Shapley values.
  • Figure 5: GDP: SHAP global feature importance measured as mean absolute Shapley values of each instance in the training sample for the COVID-19 period (Mar 2020 to Dec 2020). The features are ranked from high (top) to low (bottom) based on average Shapley values.
  • ...and 15 more figures
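
The randomized expanding window scheme described in the captions of Figures 2 and 3 can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `randomized_expanding_window_folds` and its parameters (`n_obs`, `n_test`, `val_start`, `k_folds`, `seed`) are hypothetical, and it assumes that each sampled validation point is predicted by a model trained on all observations that precede it.

```python
import numpy as np


def randomized_expanding_window_folds(n_obs, n_test, val_start, k_folds, seed=0):
    """Build k folds of (train_indices, validation_index) pairs.

    Observations 0 .. n_obs - n_test - 1 are available for tuning; the last
    n_test observations are held out as the test set. The validation superset
    is val_start .. n_obs - n_test - 1. Each fold draws n_test validation
    points at random from the superset, so the validation subset has the same
    out-of-sample size as the test set.
    """
    test_start = n_obs - n_test
    superset = np.arange(val_start, test_start)
    if n_test > len(superset):
        raise ValueError("validation superset is smaller than the test set")

    rng = np.random.default_rng(seed)
    folds = []
    for _ in range(k_folds):
        # Sample without replacement, then sort so iterations run forward in time.
        subset = np.sort(rng.choice(superset, size=n_test, replace=False))
        # Assumed expanding window: train on every observation before point t.
        folds.append([(np.arange(0, t), int(t)) for t in subset])
    return folds


# Hypothetical sizes: 60 monthly observations, 5 test points, superset from index 24.
folds = randomized_expanding_window_folds(n_obs=60, n_test=5, val_start=24, k_folds=3)
```

Each fold can then be scored by fitting a candidate model on `train_indices`, predicting the single `validation_index`, and averaging the errors over the fold's iterations. Because the validation points are spread over the whole superset, a fold is more likely to include both calm and crisis observations than a contiguous validation block at the end of the sample would be.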