Cross-Sectional Asset Retrieval via Future-Aligned Soft Contrastive Learning
Hyeongmin Lee, Chanyeol Choi, Jihoon Kwon, Yoon Kim, Alejandro Lopez-Lira, Wonbin Ahn, Yongjae Lee
TL;DR
This work reframes asset retrieval as a future-aligned task and proposes Future-Aligned Soft Contrastive Learning (FASCL), which shapes embeddings with a soft contrastive loss whose targets are pairwise future return correlations. A patch-based Transformer encoder maps historical windows to embeddings, and retrieval is performed via cosine similarity in the learned space. The authors establish an evaluation protocol with four metrics (Trend Consistency, Future Return Correlation, Information Coefficient, Sector Precision) and report state-of-the-art performance on 4,229 US equities against 13 baselines, with the gains carrying over to a downstream spread-trading task. The work provides a scalable, explainable approach to asset retrieval and sets up a standardized benchmark for future research in future-aligned cross-sectional similarity.
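The retrieval step summarized above (cosine similarity in the learned embedding space) can be illustrated with a minimal sketch. It assumes the embeddings have already been produced by some encoder; the tickers, the three-dimensional vectors, and the helper names `cosine_similarity` and `retrieve_top_k` are hypothetical and are not taken from the authors' code.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for this sketch: a zero vector matches nothing
    return dot / (norm_u * norm_v)


def retrieve_top_k(query_id, embeddings, k=2):
    """Return the k asset ids closest to query_id, excluding the query itself."""
    query = embeddings[query_id]
    scored = [
        (cosine_similarity(query, emb), asset_id)
        for asset_id, emb in embeddings.items()
        if asset_id != query_id
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [asset_id for _, asset_id in scored[:k]]


# Hypothetical embeddings standing in for encoder outputs.
toy_embeddings = {
    "AAA": [1.0, 0.0, 0.2],
    "BBB": [0.9, 0.1, 0.3],
    "CCC": [-1.0, 0.2, 0.0],
    "DDD": [0.0, 1.0, 0.0],
}

neighbors = retrieve_top_k("AAA", toy_embeddings, k=2)
print(neighbors)
```

At the scale reported here (thousands of assets), one would typically normalize all embeddings once and use a matrix product instead of the per-pair loop; the loop is kept only for readability.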
Abstract
Asset retrieval, the task of finding similar assets in a financial universe, is central to quantitative investment decision-making. Existing approaches define similarity through historical price patterns or sector classifications, but such backward-looking criteria provide no guarantee about future behavior. We argue that effective asset retrieval should be future-aligned: the retrieved assets should be those most likely to exhibit correlated future returns. To this end, we propose Future-Aligned Soft Contrastive Learning (FASCL), a representation learning framework whose soft contrastive loss uses pairwise future return correlations as continuous supervision targets. We further introduce an evaluation protocol designed to directly assess whether retrieved assets share similar future trajectories. Experiments on 4,229 US equities demonstrate that FASCL consistently outperforms 13 baselines across all future-behavior metrics. The source code will be available soon.
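To make the idea of using pairwise future return correlations as continuous supervision targets more concrete, here is one possible sketch of a soft contrastive loss. The exact FASCL formulation is not given in this section, so everything below is an assumption: the targets are a temperature-scaled softmax over Pearson correlations of future returns, the predictions are a temperature-scaled softmax over embedding cosine similarities, and the loss is the cross-entropy between the two. The temperatures `tau_emb` and `tau_target` and all function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length return series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # assumption: treat constant series as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def cosine(u, v):
    """Cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def softmax(scores, temperature):
    """Numerically stable softmax of scores divided by temperature."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_contrastive_loss(embeddings, future_returns, tau_emb=0.1, tau_target=0.1):
    """Mean cross-entropy between correlation-based soft targets and
    similarity-based predictions, computed per anchor over all other assets."""
    n = len(embeddings)
    total = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        target = softmax(
            [pearson(future_returns[i], future_returns[j]) for j in others], tau_target
        )
        predicted = softmax(
            [cosine(embeddings[i], embeddings[j]) for j in others], tau_emb
        )
        total -= sum(t * math.log(p) for t, p in zip(target, predicted))
    return total / n


# Toy data: assets 0 and 1 move together, assets 2 and 3 move opposite to them.
future_returns = [
    [0.01, -0.02, 0.03, 0.00],
    [0.02, -0.01, 0.04, 0.01],
    [-0.01, 0.02, -0.03, 0.00],
    [-0.02, 0.01, -0.04, -0.01],
]
aligned = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [-0.9, -0.1]]
misaligned = [[1.0, 0.0], [-1.0, 0.0], [0.9, 0.1], [-0.9, -0.1]]

loss_aligned = soft_contrastive_loss(aligned, future_returns)
loss_misaligned = soft_contrastive_loss(misaligned, future_returns)
print(loss_aligned, loss_misaligned)
```

In this toy setup, embeddings that place co-moving assets close together should receive a lower loss than embeddings that place them far apart, which is the behavior a future-aligned objective is meant to reward. In practice such a loss would be written in an autodiff framework and applied to encoder outputs per batch.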
