A Hype-Adjusted Probability Measure for NLP Stock Return Forecasting
Zheng Cao, Helyette Geman
TL;DR
This paper addresses the challenge of forecasting stock returns and volatility using NLP-derived sentiment signals for the semiconductor sector. It introduces a hype-adjusted probability measure, $\mathbb{P}^a$, that reweights the state space to correct for news bias, memory effects, and sentiment-direction shifts, drawing on change-of-measure ideas from asset pricing. The authors develop an adjusted sentiment score equation and integrate it into ML-based forecasting (OLS and logistic regression), demonstrating performance gains including up to about $8\%$ higher return-direction accuracy. The work bridges asset-pricing theory and NLP practice, providing a robust framework to quantify and correct market hype in sentiment-driven forecasts with implications for risk management and policy.
Abstract
This article introduces a Hype-Adjusted Probability Measure in the context of a new Natural Language Processing (NLP) approach to stock return and volatility forecasting. A novel sentiment score equation is proposed to represent the impact of intraday news on next-period stock return and volatility forecasts for selected U.S. semiconductor tickers, a particularly active industry sector. The approach improves forecast accuracy by accounting for news bias, memory, and weight, and by incorporating shifts in sentiment direction. More importantly, it extends the change of probability measure, a remarkable tool developed in asset pricing, to NLP forecasting. It does so by constructing a Hype-Adjusted Probability Measure, obtained by redistributing the weights in the probability space so as to correct for excessive or insufficient news coverage.
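To make the idea of redistributing weights in the probability space concrete, the following is a minimal sketch of a generic discrete change of measure. It is not the authors' sentiment score equation, which is not reproduced in this summary. The function names `hype_adjusted_measure` and `expectation`, the three news states, and the `hype_weight` values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def hype_adjusted_measure(p, hype_weight):
    """Reweight a discrete probability vector by a positive density.

    Illustrative only: `hype_weight` is a hypothetical, strictly positive
    weight per state (below 1 where news is assumed excessive, above 1
    where it is assumed insufficient).
    """
    p = np.asarray(p, dtype=float)
    w = np.asarray(hype_weight, dtype=float)
    if p.shape != w.shape:
        raise ValueError("p and hype_weight must have the same shape")
    if np.any(p < 0) or not np.isclose(p.sum(), 1.0):
        raise ValueError("p must be a probability vector")
    if np.any(w <= 0):
        raise ValueError("hype_weight must be strictly positive")
    # Discrete Radon-Nikodym derivative dP^a/dP, normalised so E_P[density] = 1.
    density = w / np.dot(p, w)
    return p * density


def expectation(p, x):
    """Expectation of the state-dependent quantity x under the measure p."""
    return float(np.dot(p, x))


# Hypothetical example: negative, neutral, and positive news states.
p = np.array([0.2, 0.5, 0.3])            # original measure P
sentiment = np.array([-1.0, 0.0, 1.0])   # sentiment score in each state
hype_weight = np.array([1.0, 1.0, 0.5])  # assume positive news is over-covered

p_a = hype_adjusted_measure(p, hype_weight)
raw_mean = expectation(p, sentiment)       # expected sentiment under P
adjusted_mean = expectation(p_a, sentiment)  # expected sentiment under P^a
```

In this toy setting, down-weighting the over-covered positive state lowers the expected sentiment relative to the original measure, while the adjusted weights still sum to one. How the weights are actually estimated from news data is the subject of the paper itself and is not captured by this sketch.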
