Enhanced LFTSformer: A Novel Long-Term Financial Time Series Prediction Model Using Advanced Feature Engineering and the DS Encoder Informer Architecture

Jianan Zhang, Hongyi Duan

TL;DR

This work tackles long-term financial time-series forecasting by proposing Enhanced LFTSformer, a hybrid model that fuses VMD-MIC-based feature engineering with a DS Encoder Informer architecture. The approach leverages a Stacked Informer and Distributed Sparse Attention, coupled with GC-Enhanced Adam optimization and a Dynamic Loss Function, to improve accuracy, adaptability, and generalization on benchmark stock datasets. Key contributions include a MIC-guided VMD framework for optimal decomposition, a temporal embedding and multi-scale encoder design for long-horizon dependencies, and an adaptive training strategy that enhances robustness to market volatility. Empirical results on 22 Chinese stocks show superior performance and stable predictions, though small-cap stocks present challenges, highlighting practical significance for risk management and asset allocation. The study also outlines avenues for future work, including event-based feature augmentation and non-linear feature engineering to further refine predictive efficacy.

Abstract

This study presents a groundbreaking model for forecasting long-term financial time series, termed the Enhanced LFTSformer. The model distinguishes itself through several significant innovations: (1) VMD-MIC+FE Feature Engineering: The incorporation of sophisticated feature engineering techniques, specifically through the integration of Variational Mode Decomposition (VMD), Maximal Information Coefficient (MIC), and feature engineering (FE) methods, enables comprehensive perception and extraction of deep-level features from complex and variable financial datasets. (2) DS Encoder Informer: The architecture of the original Informer has been modified by adopting a Stacked Informer structure in the encoder, and an innovative introduction of a multi-head decentralized sparse attention mechanism, referred to as the Distributed Informer. This modification has led to a reduction in the number of attention blocks, thereby enhancing both the training accuracy and speed. (3) GC Enhanced Adam & Dynamic Loss Function: The deployment of a Gradient Clipping-enhanced Adam optimization algorithm and a dynamic loss function represents a pioneering approach within the domain of financial time series prediction. This novel methodology optimizes model performance and adapts more dynamically to evolving data patterns. Systematic experimentation on a range of benchmark stock market datasets demonstrates that the Enhanced LFTSformer outperforms traditional machine learning models and other Informer-based architectures in terms of prediction accuracy, adaptability, and generality. Furthermore, the paper identifies potential avenues for future enhancements, with a particular focus on the identification and quantification of pivotal impacting events and news. This is aimed at further refining the predictive efficacy of the model.
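
As a rough illustration of the "Gradient Clipping-enhanced Adam" idea named in the abstract, the sketch below clips the gradient before a standard Adam update. The abstract does not say which clipping rule or hyperparameters the authors use, so global L2-norm clipping, the `max_norm` threshold, and the helper names `clip_by_global_norm`, `init_state`, and `adam_step` are all assumptions made for illustration, not the paper's implementation.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale grads so their global L2 norm is at most max_norm (assumed rule)."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm == 0.0 or norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


def init_state(n):
    """Fresh Adam state: step counter plus first and second moment estimates."""
    return {"t": 0, "m": [0.0] * n, "v": [0.0] * n}


def adam_step(params, grads, state, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, max_norm=1.0):
    """One Adam update applied to gradients that were clipped first."""
    grads = clip_by_global_norm(grads, max_norm)
    state["t"] += 1
    t = state["t"]
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        # Exponential moving averages of the gradient and its square.
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        # Bias-corrected moment estimates.
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    return new_params


# Toy usage: one step on f(x) = sum(x_i^2), whose gradient is 2 * x.
params = [3.0, -4.0]
state = init_state(len(params))
grads = [2 * p for p in params]
params = adam_step(params, grads, state, lr=0.1, max_norm=1.0)
```

Clipping bounds the size of the gradient that feeds Adam's moment estimates, which is one common way to damp the effect of sudden, large gradients such as those caused by volatile price moves.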

Paper Structure

This paper contains 43 sections, 28 equations, 14 figures, and 10 tables.

Figures (14)

  • Figure 1: The input embedding for the Informer model consists of three distinct components: scalar projection, local timestamp embeddings (positional), and global timestamp embeddings.
  • Figure 2: Stacked Informer Structure with 4 Layers.
  • Figure 4: Relationship Between Different Ks and Their Corresponding MICyy.
  • Figure 5: The Decomposition Results For Each Feature: IMF Values for Each K.
  • Figure 6: The Decomposition Results For Each Feature: Power Spectral Density (PSD) for Each Corresponding IMF.
  • ...and 9 more figures