Temporal Representation Learning for Stock Similarities and Its Applications in Investment Management
Yoontae Hwang, Stefan Zohren, Yongjae Lee
TL;DR
SimStock is a temporal self-supervised learning framework that combines self-supervised learning (SSL) with temporal domain generalization to learn stock representations that are robust to market non-stationarity. It builds temporal feature variants from moving averages, adds static metadata embeddings, and applies a dimension corruption augmentation; embeddings are then learned with a triplet loss and an attention-based module so that similar stocks can be identified both within and across exchanges. The approach achieves state-of-the-art performance in finding similar stocks and translates into practical gains for pairs trading, index tracking of thematic ETFs, and portfolio optimization, outperforming conventional covariance-based approaches and existing SSL baselines across multiple markets. These results highlight the potential of data-driven, temporally aware representations to improve investment decision-making and risk management in a dynamic global financial landscape.
Abstract
In the era of rapid globalization and digitalization, accurate identification of similar stocks has become increasingly challenging due to the non-stationary nature of financial markets and the ambiguity in conventional regional and sector classifications. To address these challenges, we introduce SimStock, a novel temporal self-supervised learning framework that combines techniques from self-supervised learning (SSL) and temporal domain generalization to learn robust and informative representations of financial time series data. The primary focus of our study is to understand the similarities between stocks from a broader perspective, considering the complex dynamics of the global financial landscape. We conduct extensive experiments on four real-world datasets with thousands of stocks and demonstrate the effectiveness of SimStock in finding similar stocks, outperforming existing methods. The practical utility of SimStock is showcased through its application to various investment strategies, such as pairs trading, index tracking, and portfolio optimization, where it leads to superior performance compared to conventional methods. Our findings empirically demonstrate the potential of data-driven approaches to enhance investment decision-making and risk management practices by leveraging the power of temporal self-supervised learning in the face of the ever-changing global financial landscape.
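
To make two of the ingredients named above more concrete, the following is a minimal sketch of a corruption-style augmentation feeding a triplet loss. It is an illustration under stated assumptions, not the authors' implementation: the function names (`corrupt_dimensions`, `triplet_loss`), the corruption rule (swapping a random subset of feature dimensions with values from another sample), the Euclidean distance, the margin, and the toy feature vectors are all hypothetical. In a full model the loss would presumably be computed on encoder outputs rather than on raw features as done here.

```python
import math
import random


def corrupt_dimensions(x, donor, rate, rng):
    """Return a copy of x in which each dimension is replaced, with
    probability `rate`, by the matching dimension of `donor`.
    (Assumed form of a dimension corruption augmentation.)"""
    return [d if rng.random() < rate else v for v, d in zip(x, donor)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


rng = random.Random(0)                   # fixed seed for reproducibility
anchor = [0.2, -0.1, 0.4, 0.0]           # hypothetical features of one stock
other = [1.5, 0.9, -1.2, 2.0]            # hypothetical features of another stock

# The positive is a lightly corrupted view of the anchor; the negative is
# the other stock. Minimizing the loss pulls the two views of the same
# stock together and pushes the different stock away.
positive = corrupt_dimensions(anchor, other, rate=0.25, rng=rng)
loss = triplet_loss(anchor, positive, other, margin=1.0)
print(f"triplet loss: {loss:.4f}")
```

The design choice worth noting is that the positive example is generated from the anchor itself, so no similarity labels are needed; this is what makes the setup self-supervised.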
