Contrastive Learning of Asset Embeddings from Financial Time Series
Rian Dolphin, Barry Smyth, Ruihai Dong
TL;DR
This work tackles the problem of learning informative asset representations from financial time series in the presence of market noise and nonstationarity. It introduces a contrastive learning framework that builds positive and negative samples from rolling-window return similarities via a hypothesis-test-based sampling strategy, and it evaluates three loss variants to shape the embedding space. Empirical results on industry sector classification and portfolio hedging show that the proposed embeddings outperform baselines, highlighting the practical value of self-supervised asset representations. The approach offers a data-driven means of uncovering meaningful asset relationships that can enhance downstream financial analytics and decision making.
Abstract
Representation learning has emerged as a powerful paradigm for extracting valuable latent features from complex, high-dimensional data. In financial domains, informative asset representations can support tasks such as sector classification and risk management. However, the complex and stochastic nature of financial markets poses unique challenges. We propose a novel contrastive learning framework to generate asset embeddings from financial time series data. Our approach leverages the similarity of asset returns over many subwindows to generate informative positive and negative samples, using a statistical sampling strategy based on hypothesis testing to address the noisy nature of financial data. We explore various contrastive loss functions that capture the relationships between assets in different ways to learn a discriminative representation space. Experiments on real-world datasets demonstrate the effectiveness of the learned asset embeddings on benchmark industry classification and portfolio optimization tasks. In each case, our novel approaches significantly outperform existing baselines, highlighting the potential for contrastive learning to capture meaningful and actionable relationships in financial data.
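The sampling idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the test, so everything concrete here is an assumption made for illustration: Pearson correlation as the similarity, non-overlapping windows, a top-k co-occurrence count, and a one-sided normal-approximation binomial test against the chance rate k/(n-1). The function names (`pearson`, `topk_counts`, `sample_sets`) and the synthetic two-sector data are hypothetical and are not taken from the paper.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def topk_counts(returns, window, k):
    """Count, over non-overlapping windows, how often asset j is among
    the k assets most correlated with asset i (assumed similarity)."""
    n_assets, n_steps = len(returns), len(returns[0])
    counts = [[0] * n_assets for _ in range(n_assets)]
    n_windows = 0
    for start in range(0, n_steps - window + 1, window):
        n_windows += 1
        segs = [r[start:start + window] for r in returns]
        for i in range(n_assets):
            sims = sorted(
                ((pearson(segs[i], segs[j]), j)
                 for j in range(n_assets) if j != i),
                reverse=True,
            )
            for _, j in sims[:k]:
                counts[i][j] += 1
    return counts, n_windows


def sample_sets(counts, n_windows, k, z_crit=1.645):
    """Split candidates into positives and negatives with a z-test.

    Assumed null hypothesis: j lands in i's top-k purely by chance,
    i.e. with probability p0 = k / (n_assets - 1) in each window.
    """
    n_assets = len(counts)
    p0 = k / (n_assets - 1)
    mean = n_windows * p0
    std = math.sqrt(n_windows * p0 * (1 - p0))
    positives, negatives = {}, {}
    for i in range(n_assets):
        z = {j: (counts[i][j] - mean) / std
             for j in range(n_assets) if j != i}
        # Significantly more co-occurrences than chance: positive sample.
        positives[i] = [j for j, v in z.items() if v > z_crit]
        # Significantly fewer than chance: negative sample.
        negatives[i] = [j for j, v in z.items() if v < -z_crit]
    return positives, negatives


# Synthetic data: two "sectors" of three assets, each driven by its own factor.
rng = random.Random(0)
n_steps = 600
factors = [[rng.gauss(0, 1) for _ in range(n_steps)] for _ in range(2)]
returns = [
    [f + rng.gauss(0, 0.2) for f in factors[g]]
    for g in (0, 0, 0, 1, 1, 1)
]

counts, n_windows = topk_counts(returns, window=20, k=2)
positives, negatives = sample_sets(counts, n_windows, k=2)
print("positives for asset 0:", positives[0])
print("negatives for asset 0:", negatives[0])
```

Pairs whose co-occurrence count is not significantly different from chance fall into neither set, which is one plausible way to keep noisy, uninformative pairs out of a contrastive loss. The threshold `z_crit`, the window length, and `k` are illustrative choices only.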
