MTMD: Multi-Scale Temporal Memory Learning and Efficient Debiasing Framework for Stock Trend Forecasting
Mingjie Wang, Juanxi Tian, Mingze Zhang, Jianxiong Guo, Weijia Jia
TL;DR
MTMD tackles stock trend forecasting by explicitly modeling multi-scale temporal dependencies and debiasing noisy signals through a memory-based framework. It introduces memory items and a memory-aware aggregator to fuse local stock-concept features with global profit-pattern signals from past data, using predefined and hidden concept blocks to capture both known and latent factors. Across CSI 100/300/500 datasets, MTMD achieves state-of-the-art IC, Rank IC, and Precision@N, with ablations confirming the memory module’s critical role in debiasing and robustness, while offering favorable computational efficiency ($O(T)$) relative to recurrent approaches. The approach is plug-and-play with alternative backbones, and its graph-enabled fusion of global-local information yields practical improvements for real-market forecasting and decision-making.
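For readers unfamiliar with the three reported metrics, the following is a minimal, standard-library sketch of how they are commonly computed for one trading day. It assumes IC is the Pearson correlation between predicted scores and realized returns, Rank IC is the same correlation on ranks (Spearman), and Precision@N is the share of the N highest-scored stocks with a positive realized return. The function names and the toy numbers are illustrative assumptions, not taken from the MTMD code base, and the paper's exact evaluation protocol may differ.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def ic(pred, ret):
    """Information coefficient (assumed here: Pearson correlation)."""
    return pearson(pred, ret)

def rank_ic(pred, ret):
    """Rank IC (assumed here: Pearson correlation of the ranks)."""
    return pearson(average_ranks(pred), average_ranks(ret))

def precision_at_n(pred, ret, n):
    """Share of the n highest-scored stocks whose realized return is positive."""
    top = sorted(range(len(pred)), key=lambda i: pred[i], reverse=True)[:n]
    return sum(ret[i] > 0 for i in top) / n

# Toy cross-section of four stocks (made-up numbers for illustration only).
pred = [0.3, 0.1, 0.2, -0.1]
ret = [0.02, -0.01, 0.03, -0.02]
print(ic(pred, ret), rank_ic(pred, ret), precision_at_n(pred, ret, 2))
```

In practice these per-day values are typically averaged over all days in the test period.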
Abstract
Stock trend forecasting aims to predict the future trajectory of the stock market, using either manual or technical methods, in order to optimize profitability. Recent machine learning methods have proven effective at identifying genuine profit signals, mainly from temporal data derived from historical stock prices. However, the volatile and dynamic nature of the stock market makes it difficult to learn multi-scale temporal dependencies and to capture stable trading opportunities, primarily because real profit signal patterns are hard to distinguish within mixed, noisy data. To address this, we propose a Multi-Scale Temporal Memory Learning and Efficient Debiasing (MTMD) model. The model builds a memory module from a learnable embedding combined with external attention, using self-similarity to reduce noise interference and strengthen temporal consistency. MTMD integrates comprehensive local information at each timestamp while attending to salient historical patterns at a global scale. In addition, a graph network tailored to combine global and local information enables adaptive fusion of heterogeneous multi-scale data. Extensive ablation studies and experiments show that MTMD outperforms state-of-the-art methods by a substantial margin on benchmark datasets. The source code is available at https://github.com/MingjieWang0606/MDMT-Public.
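To make the idea of a memory module built from a learnable embedding and attention more concrete, the sketch below shows one generic way a feature vector can read from a small bank of memory items. It is an illustration under stated assumptions, not the authors' implementation: `memory_read`, the single shared memory bank, the dot-product scoring, and the residual fusion at the end are all hypothetical choices, and the memory items are random here whereas a trained model would learn them.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def memory_read(feature, memory):
    """Attend from one feature vector to a bank of memory items.

    Returns the attention-weighted mix of memory items and the weights.
    """
    # Similarity of the feature to every memory item (dot product).
    scores = [sum(f * m for f, m in zip(feature, item)) for item in memory]
    weights = softmax(scores)
    dim = len(feature)
    # Convex combination of the memory items, one coordinate at a time.
    read = [sum(w * item[d] for w, item in zip(weights, memory))
            for d in range(dim)]
    return read, weights

random.seed(0)
DIM, NUM_ITEMS = 4, 3  # hypothetical sizes chosen for the example

# Stand-in for learnable memory items (random here, trained in a real model).
memory = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_ITEMS)]
# Stand-in for the local feature of one stock at one timestamp.
feature = [random.gauss(0, 1) for _ in range(DIM)]

read, weights = memory_read(feature, memory)
# Assumed fusion step: add the global memory read to the local feature.
fused = [f + r for f, r in zip(feature, read)]
```

Because the memory bank is shared across all timestamps and has a fixed number of items, each read costs the same regardless of where in the sequence the feature comes from, so reading at every one of `T` timestamps scales linearly in `T`.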
