FactorGCL: A Hypergraph-Based Factor Model with Temporal Residual Contrastive Learning for Stock Returns Prediction

Yitong Duan, Weiran Wang, Jian Li

TL;DR

FactorGCL addresses the difficulty of extracting effective data-driven factors for stock return prediction when market data has a low signal-to-noise ratio. It introduces a hypergraph-based factor model with a cascading residual architecture that separates returns into prior beta, hidden beta, and alpha components, and pairs it with temporal residual contrastive learning to guide the extraction of hidden factors that are both effective and comprehensive. The authors report state-of-the-art predictive performance and profitable investment simulations on real market data, with ablation studies supporting each module's contribution. The framework offers a principled way to uncover meaningful risk factors and inform portfolio decisions.

Abstract

As a fundamental method in economics and finance, the factor model has been extensively utilized in quantitative investment. In recent years, there has been a paradigm shift from traditional linear models with expert-designed factors to more flexible nonlinear machine learning-based models with data-driven factors, aiming to enhance the effectiveness of these factor models. However, due to the low signal-to-noise ratio in market data, mining effective factors in data-driven models remains challenging. In this work, we propose a hypergraph-based factor model with temporal residual contrastive learning (FactorGCL) that employs a hypergraph structure to better capture high-order nonlinear relationships among stock returns and factors. To mine hidden factors that supplement human-designed prior factors for predicting stock returns, we design a cascading residual hypergraph architecture, in which the hidden factors are extracted from the residual information after removing the influence of prior factors. Additionally, we propose a temporal residual contrastive learning method to guide the extraction of effective and comprehensive hidden factors by contrasting stock-specific residual information over different time periods. Our extensive experiments on real stock market data demonstrate that FactorGCL not only outperforms existing state-of-the-art methods but also mines effective hidden factors for predicting stock returns.
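The cascading residual idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' hypergraph architecture: each stage here is a plain single-factor least-squares projection standing in for a learned module, and the names `project`, `cascade_decompose`, `prior_exposure`, and `hidden_exposure` are hypothetical. It only shows the ordering of the decomposition: the hidden component is fitted to what the prior factor leaves unexplained, and alpha is what remains after both.

```python
def project(target, exposure):
    """Least-squares fit of `target` on one exposure vector (no intercept).

    Stand-in for a learned module; the paper uses hypergraph networks instead.
    Assumes `exposure` is not all zeros.
    """
    denom = sum(x * x for x in exposure)
    coef = sum(x * y for x, y in zip(exposure, target)) / denom
    return [coef * x for x in exposure]


def cascade_decompose(returns, prior_exposure, hidden_exposure):
    """Split cross-sectional returns into prior beta, hidden beta, and alpha."""
    # Stage 1: the part of returns explained by the human-designed prior factor.
    prior_beta = project(returns, prior_exposure)
    residual = [r - p for r, p in zip(returns, prior_beta)]

    # Stage 2: the hidden factor only sees the residual left by stage 1.
    hidden_beta = project(residual, hidden_exposure)

    # Stage 3: whatever remains is treated as stock-specific alpha.
    alpha = [e - h for e, h in zip(residual, hidden_beta)]
    return prior_beta, hidden_beta, alpha


# Toy cross-section of four stocks (illustrative numbers, not market data).
returns = [0.02, -0.01, 0.03, 0.00]
prior_exposure = [1.0, 0.5, 1.0, 0.2]
hidden_exposure = [0.3, 1.0, -0.4, 0.8]
prior_beta, hidden_beta, alpha = cascade_decompose(
    returns, prior_exposure, hidden_exposure
)
```

By construction the three components sum back to the original returns, and the alpha component is orthogonal to the hidden exposure used in the second stage.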

Paper Structure

This paper contains 23 sections, 11 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Stock price trends vary across different sectors and industries. During the COVID-19 pandemic, the stock price trends of the electronic consumer and medical industries exhibited high correlations, which are insufficiently explained using human-designed industry factors.
  • Figure 2: Brief illustration of FactorGCL.
  • Figure 3: Overview of the cascading residual hypergraph architecture in FactorGCL. Stock returns are decomposed into prior beta, hidden beta, and individual alpha components. Each component is extracted from the residuals after removing the influence of the preceding component.
  • Figure 4: Illustration of the information propagation process in the HyperGCN. The HyperGCN can model the nonlinear influence of factors on stocks by aggregating information from stock nodes connected by the same hyperedge.
  • Figure 5: Illustration of the temporal residual contrastive learning method. The model contrasts the past and future alpha embeddings of the same stock as positive pairs, and the embeddings of different stocks as negative pairs.
  • ...and 2 more figures
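
The pairing scheme in the Figure 5 caption (past and future alpha embeddings of the same stock as positives, embeddings of different stocks as negatives) resembles a standard InfoNCE-style objective. The sketch below is a generic version of that loss written under assumptions: cosine similarity, a `temperature` of 0.1, and the function names `cosine` and `info_nce` are illustrative choices, and the paper's exact loss may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(past, future, temperature=0.1):
    """Generic InfoNCE loss over per-stock embeddings.

    past[i] and future[i] belong to the same stock (positive pair);
    past[i] and future[j] with j != i are treated as negative pairs.
    The temperature value is an assumption, not taken from the paper.
    """
    n = len(past)
    total = 0.0
    for i in range(n):
        logits = [cosine(past[i], future[j]) / temperature for j in range(n)]
        # Log-sum-exp with max subtraction for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        # Negative log-probability of picking the matching stock.
        total += log_denominator - logits[i]
    return total / n


# Toy embeddings for two stocks (illustrative values only).
past = [[1.0, 0.0], [0.0, 1.0]]
aligned_future = [[1.0, 0.0], [0.0, 1.0]]
swapped_future = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = info_nce(past, aligned_future)
loss_swapped = info_nce(past, swapped_future)
```

The loss is small when each stock's past embedding is closest to its own future embedding and large when it is closer to another stock's, which is the behavior the caption describes.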