Forecasting GDP in Europe with Textual Data

Luca Barbaglia, Sergio Consoli, Sebastiano Manzan

TL;DR

The paper develops FiGAS-based, aspect-specific sentiment indicators from a large multilingual news corpus to forecast GDP and other macro variables for five European economies. By integrating these indicators through a mixed-frequency framework (unrestricted MIDAS, or U-MIDAS) and robust inference (double-lasso with multiple testing adjustments), the authors demonstrate incremental predictive power over standard macro and survey signals, with effects varying by country and horizon. Out-of-sample tests reveal substantial reductions in forecast errors, particularly at longer horizons and during recessions, while robustness checks extend the findings to unemployment, industrial production (IPI), and consumer prices (CPI). The work highlights the value of high-frequency, text-derived sentiment for real-time economic monitoring and suggests avenues for further refinement, including expanded vocabularies and nonlinear modeling. In practice, the indicators give forecasters timely, country-specific signals that complement traditional indicators, improving nowcasting and forecasting under real-time data constraints.

Abstract

We evaluate the informational content of news-based sentiment indicators for forecasting Gross Domestic Product (GDP) and other macroeconomic variables of the five major European economies. Our data set includes over 27 million articles from 26 major newspapers in 5 different languages. The evidence indicates that these sentiment indicators are significant predictors of macroeconomic variables and that their predictive content is robust to controlling for other indicators available to forecasters in real time.

Paper Structure (9 sections, 1 equation, 6 figures, 4 tables)

Figures (6)

  • Figure 1: Time series of the standardized news-based sentiment indicators for Germany (DE), Spain (ES), France (FR), Italy (IT), and the United Kingdom (UK). The sentiment is averaged within the quarter and sampled at the quarterly frequency. The shaded areas represent the recessions established by the CEPR-EABCN business cycle dating committee.
  • Figure 2: Kernel density estimates of the standardized sentiment indicators during expansions and recessions by country and topic. Expansions and recessions are based on the classification of the CEPR-EABCN business cycle dating committee.
  • Figure 3: Correlation coefficients of the sentiment indicators across countries. Darker colors indicate larger correlation in absolute value.
  • Figure 4: Statistical significance of the survey and sentiment measures as predictors of GDP growth by country with p-values corrected for multiple testing across horizons. The grey area represents the quarter being forecast and "release" indicates the release date. The $x$-axis reports the horizon $h$, which ranges from 15 days before the release date to approximately 4 quarters ahead at intervals of 15 days. The color of the tile represents the p-value of the coefficient of the survey or sentiment indicators $\eta_h$ in Equation \ref{eqn:eq.dl}: the darker the tile, the smaller the p-value.
  • Figure 5: Ratio of the MSFE for the ARXS specification relative to the ARX benchmark across horizons. The grey area represents the quarter being forecast and rel. indicates the release date. The $x$-axis reports the horizon $h$, which ranges from 15 days before the release date to approximately 4 quarters ahead at intervals of 15 days.
  • ...and 1 more figure
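
The MSFE ratio reported in Figure 5 can be illustrated with a minimal sketch. It assumes the usual definition of the mean squared forecast error (the average of squared differences between realized and forecast values) and divides the value for the ARXS specification by the value for the ARX benchmark, so a ratio below 1 means the specification with sentiment has smaller errors. The function names and all numbers are hypothetical and chosen only for illustration; none of them come from the paper.

```python
def msfe(actual, forecast):
    """Mean squared forecast error over paired observations."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return sum(squared_errors) / len(squared_errors)


def msfe_ratio(actual, model_forecast, benchmark_forecast):
    """MSFE of the model divided by MSFE of the benchmark (< 1 favors the model)."""
    return msfe(actual, model_forecast) / msfe(actual, benchmark_forecast)


# Hypothetical quarterly GDP growth figures and forecasts (illustration only).
gdp_growth = [0.5, 0.25, -0.5, 1.0]
arx_forecast = [0.0, 0.0, 0.0, 0.0]    # stand-in for the ARX benchmark
arxs_forecast = [0.5, 0.25, 0.0, 0.5]  # stand-in for ARX plus sentiment (ARXS)

ratio = msfe_ratio(gdp_growth, arxs_forecast, arx_forecast)
print(f"MSFE ratio (ARXS / ARX): {ratio:.2f}")
```

In the figure this ratio is computed separately for each horizon $h$; the sketch covers a single horizon.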