FinRpt: Dataset, Evaluation System and LLM-based Multi-agent Framework for Equity Research Report Generation

Song Jin, Shuqi Li, Shukun Zhang, Rui Yan

TL;DR

This work formally defines the Equity Research Report (ERR) generation task and introduces FinRpt, an open benchmark of 6,825 ERRs derived from 800 CSI800 stocks and multiple data sources. It presents FinRpt-Gen, a nine-agent, multi-task framework trained with Supervised Fine-Tuning (LoRA) and Reinforcement Learning (DAPO) to produce coherent six-section ERRs. The authors also develop an evaluation system that combines basic NLP metrics with domain-specific LLM assessments; their experiments support the quality of the data and show performance that rivals leading models. By open-sourcing FinRpt and its pipeline, the authors aim to standardize ERR generation, accelerate research, and enable scalable, automated financial reporting.

Abstract

While LLMs have shown great success in financial tasks such as stock prediction and question answering, their application to fully automated Equity Research Report generation remains uncharted territory. In this paper, we formulate the Equity Research Report (ERR) Generation task for the first time. To address the scarcity of data and the absence of evaluation metrics, we present FinRpt, an open-source evaluation benchmark for ERR generation. We design a Dataset Construction Pipeline that integrates 7 financial data types and automatically produces a high-quality ERR dataset, which can be used for model training and evaluation. We also introduce a comprehensive evaluation system comprising 11 metrics to assess the generated ERRs. Moreover, we propose FinRpt-Gen, a multi-agent framework tailored to this task, and train several LLM-based agents on the proposed datasets using Supervised Fine-Tuning and Reinforcement Learning. Experimental results demonstrate the data quality and metric effectiveness of the FinRpt benchmark and the strong performance of FinRpt-Gen, showcasing their potential to drive innovation in ERR generation. All code and datasets are publicly available.

Paper Structure

This paper contains 37 sections, 7 equations, 6 figures, and 16 tables.

Figures (6)

  • Figure 1: The Dataset Construction Pipeline, Data Collection Module, and the Dataset Enhancement Module.
  • Figure 2: The proportion of reports from different industries of the FinRpt dataset.
  • Figure 3: The framework of the proposed FinRpt-Gen.
  • Figure 4: The performance comparison under the LLM evaluation metrics.
  • Figure 5: The average input tokens, output tokens, and costs for each agent. (The price calculation is based on the official API price of OpenAI as of February 1, 2025.)
  • ...and 1 more figure