News and Load: A Quantitative Exploration of Natural Language Processing Applications for Forecasting Day-ahead Electricity System Demand
Yun Bai, Simon Camal, Andrea Michiorri
TL;DR
The study investigates whether unstructured public news text can improve day-ahead electricity demand forecasting by linking textual signals to load through an end-to-end forecasting chain. It combines statistical, semantic, and embedding-based text features with traditional predictors in an ExtraTrees regression framework, using Granger causality tests for feature filtering, LIME for local explainability, and Double Machine Learning (Double ML) for causal analysis. Evaluations on UK and Northern Ireland demand with a BBC news corpus show consistent forecast improvements of roughly 4–11% across RMSE, MAE, and SMAPE. The largest gains come from a feature set that combines word-frequency signals from news titles, sentiment scores from news bodies, and GloVe embeddings. The work also reveals global and local correlations and time-varying causal effects, offering a foundation for incorporating social and economic dynamics into energy forecasting and motivating future research with broader data sources and finer spatial granularity.
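To make the reported error metrics concrete, the following minimal sketch computes RMSE, MAE, and SMAPE for a baseline forecast and a text-augmented forecast, then the relative improvement between them. The function names and the toy numbers are illustrative assumptions, not taken from the study, and the SMAPE form used here (mean of absolute error over the average of absolute actual and forecast, in percent) is one common definition that may differ from the one the authors use.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    )


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def smape(actual, forecast):
    """Symmetric MAPE in percent (assumed definition; undefined when a = f = 0)."""
    terms = (
        abs(a - f) / ((abs(a) + abs(f)) / 2.0)
        for a, f in zip(actual, forecast)
    )
    return 100.0 * sum(terms) / len(actual)


def relative_improvement(baseline_error, new_error):
    """Percentage reduction of an error metric relative to the baseline."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Made-up demand values, for illustration only.
actual = [100.0, 200.0]
baseline_forecast = [110.0, 190.0]   # hypothetical forecast without text features
text_forecast = [105.0, 195.0]       # hypothetical forecast with text features

for name, metric in (("RMSE", rmse), ("MAE", mae), ("SMAPE", smape)):
    base = metric(actual, baseline_forecast)
    new = metric(actual, text_forecast)
    print(f"{name}: {base:.3f} -> {new:.3f} "
          f"({relative_improvement(base, new):.1f}% improvement)")
```

A relative improvement defined this way is positive when the text-augmented forecast has the lower error, which is the sense in which a 4–11% gain is read here.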
Abstract
The relationship between electricity demand and weather is well established in power systems, as is the importance of behavioral and social factors such as holidays and significant events. This study explores the link between electricity demand and more nuanced information about social events, using mature Natural Language Processing (NLP) and demand forecasting techniques. The results indicate that day-ahead forecasts are improved by textual features such as word frequencies, public sentiment, topic distributions, and word embeddings. The social events captured by these features include global pandemics, politics, international conflicts, and transportation, among others. Causal effects and correlations are discussed to propose explanations for the mechanisms behind the links highlighted. The study brings a new perspective to traditional electricity demand analysis and confirms the feasibility of improving forecasts with unstructured text, with potential consequences for sociology and economics.
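As an illustration of the simplest textual feature mentioned above, the sketch below turns news titles into daily word-frequency vectors that could sit alongside conventional predictors. It is a minimal example under stated assumptions: the function name `daily_word_counts`, the tokenization rule, the fixed vocabulary, and the sample headlines are all hypothetical and do not reproduce the study's preprocessing.

```python
import re
from collections import Counter


def daily_word_counts(titles_by_date, vocabulary):
    """Count occurrences of each vocabulary word in the titles of each day.

    titles_by_date: dict mapping a date string to a list of news titles.
    vocabulary: ordered list of lowercase words to track.
    Returns a dict mapping each date to a list of counts, one per word.
    """
    features = {}
    for date, titles in titles_by_date.items():
        counts = Counter()
        for title in titles:
            # Lowercase and keep alphabetic tokens only (assumed tokenization).
            counts.update(re.findall(r"[a-z]+", title.lower()))
        features[date] = [counts[word] for word in vocabulary]
    return features


# Invented headlines, for illustration only.
titles_by_date = {
    "2020-03-23": ["UK lockdown begins", "Lockdown rules explained"],
    "2020-03-24": ["Rail strike hits commuters"],
}
vocabulary = ["lockdown", "strike"]

features = daily_word_counts(titles_by_date, vocabulary)
for date, vector in features.items():
    print(date, dict(zip(vocabulary, vector)))
```

Each daily vector has one entry per tracked word, so it can be appended to that day's other inputs before fitting a regression model.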
