SETN: Stock Embedding Enhanced with Textual and Network Information
Takehiro Takayanagi, Hiroki Sakaji, Kiyoshi Izumi
TL;DR
SETN produces stock embeddings that integrate both textual descriptions and cross-stock network information. It domain-adapts a pre-trained transformer to financial text and fuses the transformer's output with GNN-based graph embeddings via a residual connection, training end to end in a multitask framework that predicts sector and industry labels. Empirical results on the Japanese market show improvements over baselines in related company information extraction and thematic fund creation, and ablations support the benefits of directed graphs and joint training. The approach offers a practical pathway for wealth-management tasks that require context-rich stock representations.
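The residual fusion and multitask objective described above can be illustrated with a minimal sketch. It is not the authors' implementation: a single parameter-free mean-aggregation step stands in for the GNN, plain Python lists stand in for transformer outputs, and the names `aggregate_neighbors`, `fuse_residual`, `multitask_loss`, and `in_neighbors` are hypothetical.

```python
import math

def aggregate_neighbors(text_emb, in_neighbors):
    """One mean-aggregation step over a directed graph.

    Assumption: in_neighbors[s] lists the stocks whose information flows
    into s. A real GNN layer would also apply learned weights.
    """
    dim = len(next(iter(text_emb.values())))
    messages = {}
    for stock in text_emb:
        sources = in_neighbors.get(stock, [])
        if not sources:
            # No incoming edges: the graph contributes nothing.
            messages[stock] = [0.0] * dim
            continue
        messages[stock] = [
            sum(text_emb[src][k] for src in sources) / len(sources)
            for k in range(dim)
        ]
    return messages

def fuse_residual(text_emb, in_neighbors):
    """Residual connection: fused = text embedding + graph message."""
    messages = aggregate_neighbors(text_emb, in_neighbors)
    return {
        stock: [t + g for t, g in zip(text_emb[stock], messages[stock])]
        for stock in text_emb
    }

def multitask_loss(sector_probs, sector_label, industry_probs, industry_label):
    """Sum of the cross-entropy terms for the sector and industry heads."""
    return -math.log(sector_probs[sector_label]) - math.log(industry_probs[industry_label])

# Toy 2-dimensional "text embeddings" for three hypothetical stocks.
text_emb = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
# Directed: A receives from B and C, B receives from C, C receives nothing.
in_neighbors = {"A": ["B", "C"], "B": ["C"]}

fused = fuse_residual(text_emb, in_neighbors)
loss = multitask_loss([0.5, 0.5], 0, [0.25, 0.75], 1)
```

Because of the residual connection, a stock with no incoming edges (here `C`) keeps its text embedding unchanged, so the graph branch can only add information to the text representation.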
Abstract
Stock embedding is a method for representing stocks as vectors. Demand for such representations is growing in the wealth management sector, where they have been applied to tasks such as stock price prediction, portfolio optimization, and similar fund identification. Stock embeddings make it possible to quantify relative relationships between stocks, and they can extract useful information from unstructured data such as text and network data. In this study, we propose stock embedding enhanced with textual and network information (SETN), which uses a domain-adaptive pre-trained transformer-based model to embed textual information and a graph neural network model to capture network information. We evaluate the proposed model on related company information extraction tasks. We also demonstrate that stock embeddings obtained from the proposed model perform better in creating thematic funds than those obtained from baseline methods, providing a promising pathway for various applications in the wealth management industry.
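To make concrete how embeddings quantify relative relationships between stocks, the sketch below ranks stocks by similarity to a query stock, as one might for related company extraction. Cosine similarity is an assumption here (the abstract does not name a similarity measure), and the stock names, vectors, and the helpers `cosine` and `most_similar` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # Convention for zero vectors, chosen for this sketch.
    return dot / (norm_u * norm_v)

def most_similar(query, embeddings, top_k=2):
    """Return the top_k (name, score) pairs most similar to the query stock."""
    scores = [
        (name, cosine(embeddings[query], vec))
        for name, vec in embeddings.items()
        if name != query
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]

# Toy embeddings, invented purely for illustration.
embeddings = {
    "automaker_1": [0.9, 0.1],
    "automaker_2": [0.8, 0.2],
    "bank_1": [0.1, 0.9],
}

ranked = most_similar("automaker_1", embeddings, top_k=2)
ranked_names = [name for name, _ in ranked]
```

In this toy setup the second automaker ranks above the bank. A thematic fund could be assembled the same way, by taking the top-ranked stocks for a theme's query vector.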
