FinTSB: A Comprehensive and Practical Benchmark for Financial Time Series Forecasting

Yifan Hu, Yuante Li, Peiyuan Liu, Yuxia Zhu, Naiqi Li, Tao Dai, Shu-tao Xia, Dawei Cheng, Changjun Jiang

TL;DR

FinTSB tackles key limitations in financial time series forecasting evaluation by introducing a movement-pattern taxonomy, tokenized data preprocessing, and a unified pipeline that standardizes metrics across ranking, portfolio, and error dimensions while incorporating real-world trading constraints. It constructs 20 diverse datasets across four movement types, enabling thorough cross-pattern and cross-market evaluation of six backbone families, including LLM-based approaches, with transfer learning showing robust zero-shot performance on the CSI 300 in 2024. The benchmark emphasizes practical relevance by modeling transaction costs and trading restrictions, and it demonstrates that no single method universally dominates, highlighting the need for diversified benchmarking and regime-aware model selection. Overall, FinTSB provides a robust platform for advancing FinTSF by improving diversity, standardization, and real-world applicability, with code available for reproducible research.

Abstract

Financial time series (FinTS) record the behavior of human-brain-augmented decision-making, capturing valuable historical information that can be leveraged for profitable investment strategies. Not surprisingly, this area has attracted considerable attention from researchers, who have proposed a wide range of methods based on various backbones. However, evaluation in this area often exhibits three systemic limitations: (1) failure to account for the full spectrum of stock movement patterns observed in dynamic financial markets (Diversity Gap); (2) the absence of unified assessment protocols, which undermines the validity of cross-study performance comparisons (Standardization Deficit); and (3) neglect of critical market structure factors, resulting in inflated performance metrics that lack practical applicability (Real-World Mismatch). To address these limitations, we propose FinTSB, a comprehensive and practical benchmark for financial time series forecasting (FinTSF). To increase diversity, we categorize movement patterns into four specific parts, tokenize and pre-process the data, and assess the data quality based on sequence characteristics. To eliminate biases due to different evaluation settings, we standardize the metrics across three dimensions and build a user-friendly, lightweight pipeline incorporating methods from various backbones. To accurately simulate real-world trading scenarios and facilitate practical implementation, we extensively model various regulatory constraints, including transaction fees, among others. Finally, we conduct extensive experiments on FinTSB, highlighting key insights to guide model selection under varying market conditions. Overall, FinTSB provides researchers with a novel and comprehensive platform for improving and evaluating FinTSF methods. The code is available at https://github.com/TongjiFinLab/FinTSBenchmark.
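
To make the three evaluation dimensions concrete, the following is a minimal sketch with one metric per dimension: a rank correlation (ranking), mean squared error (error), and a top-k portfolio return net of a proportional transaction fee (portfolio). This is not FinTSB's implementation. The function names, the equal-weight top-k strategy, and the fee model are all assumptions chosen for illustration, and the benchmark's actual metrics and trading constraints may differ.

```python
import numpy as np

def rank_ic(pred, real):
    """Ranking dimension: Spearman-style correlation between predicted and
    realized returns on one day. Ties are not averaged in this sketch."""
    pred_rank = np.argsort(np.argsort(pred)).astype(float)
    real_rank = np.argsort(np.argsort(real)).astype(float)
    return float(np.corrcoef(pred_rank, real_rank)[0, 1])

def mse(pred, real):
    """Error dimension: mean squared error of the predicted returns."""
    return float(np.mean((np.asarray(pred) - np.asarray(real)) ** 2))

def topk_net_return(pred, real, prev_holdings, k, fee_rate):
    """Portfolio dimension: equal-weight return of the k highest-scored
    stocks, minus a proportional fee on the traded fraction.
    The fee model here is a hypothetical simplification."""
    picks = set(np.argsort(pred)[-k:].tolist())
    gross = float(np.mean([real[i] for i in picks]))
    bought = len(picks - prev_holdings)   # new positions opened
    sold = len(prev_holdings - picks)     # old positions closed
    cost = fee_rate * (bought + sold) / k
    return gross - cost, picks

# Illustrative inputs only; these are not data from the benchmark.
pred = [0.02, -0.01, 0.03, 0.00]
real = [0.01, -0.02, 0.04, 0.00]
net, holdings = topk_net_return(pred, real, prev_holdings={0, 1}, k=2, fee_rate=0.001)
print(rank_ic(pred, real), mse(pred, real), net, holdings)
```

Even in this toy setting, the fee term shows why ignoring trading costs inflates reported returns: the net return falls below the gross return whenever the selected set changes between days.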

Paper Structure

This paper contains 28 sections, 6 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: FinTSF methods classified by backbone architectures and their representative works.
  • Figure 2: In the field of financial time series forecasting, existing evaluation frameworks often face three issues: Diversity Gap, Standardization Deficit, and Real-World Mismatch.
  • Figure 3: Visualization of financial time series data with different movement patterns.
  • Figure 4: Hexbin plots illustrating the normalized density values of the low-dimensional feature spaces generated by PCA, applied to stock features such as open price, close price, high price, low price, and trading volume for FinTSB, alongside four different time-sliced stock data.
  • Figure 5: The pipeline of FinTSB with four integral modules.
  • ...and 2 more figures