FinLLMs: A Framework for Financial Reasoning Dataset Generation with Large Language Models

Ziqiang Yuan, Kaiyuan Wang, Shoutai Zhu, Ye Yuan, Jingya Zhou, Yanlin Zhu, Wenqi Wei

TL;DR

FinLLMs presents a formula-driven framework for automatic generation of financial question-answering data by building a graph of financial formulas, extending it with temporal relations, and generating synthetic QA pairs using GPT-3.5. The method grounds questions and answers in domain formulas via a DSL program, enabling accurate question formulation and reliable computations from mixed tabular and textual sources. Across FinQA, TAT-QA, and FinLLMs-trained models, synthetic data improves execution and program accuracy by at least 2% and can outperform human-labeled baselines in some settings, with few-shot prompting further boosting performance. The work demonstrates scalable, cost-effective data synthesis for financial numerical reasoning and identifies future directions in fact filtering, privacy, and broader formula coverage.

Abstract

Large language models (LLMs) usually rely on extensive training datasets. In the financial domain, creating numerical reasoning datasets that include a mix of tables and long text often involves substantial manual annotation expenses. To address the limited data resources and reduce the annotation cost, we introduce FinLLMs, a method for generating financial question-answering data based on common financial formulas using large language models. First, we compile a list of common financial formulas and construct a graph based on the variables these formulas employ. We then augment the formula set by combining formulas that share identical variables into new elements. Specifically, we explore formulas obtained by manual annotation and merge those with shared variables by traversing the constructed graph. Finally, utilizing GPT-3.5, we generate financial question-answering data that encompasses both tabular information and long textual content, building on the collected formula set. Our experiments demonstrate that synthetic data generated by FinLLMs effectively enhances the performance of several large-scale numerical reasoning models in the financial domain, outperforming two established benchmark financial question-answering datasets.
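The graph construction and merging steps described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the four formulas are the ones listed in the Figure 3 caption, while the string representation, the edge rule (one formula's output appears among another's inputs), and the names `FORMULAS`, `variables`, `build_graph`, and `merge` are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Output variable -> expression over input variables (formulas from Figure 3).
# Storing formulas as plain strings is an assumption made for illustration.
FORMULAS = {
    "ebit": "total_profit + interest_expense",
    "interest_coverage_ratio": "ebit / interest_expense",
    "net_profit": "total_profit - income_tax_expense",
    "total_profit": "operating_profit + non_operating_income - non_operating_expense",
}


def variables(expr):
    """Return the set of variable names that appear in an expression."""
    return set(re.findall(r"[a-z_]+", expr))


def build_graph(formulas):
    """Link formula A to formula B when A's output is one of B's inputs."""
    edges = defaultdict(set)
    for out_a in formulas:
        for out_b, expr_b in formulas.items():
            if out_a in variables(expr_b):
                edges[out_a].add(out_b)
    return dict(edges)


def merge(formulas, inner, outer):
    """Build a new formula by substituting `inner`'s expression into `outer`."""
    pattern = rf"\b{re.escape(inner)}\b"
    return re.sub(pattern, f"({formulas[inner]})", formulas[outer])


graph = build_graph(FORMULAS)
merged = merge(FORMULAS, "ebit", "interest_coverage_ratio")
print(graph)
print("interest_coverage_ratio =", merged)
```

Walking an edge of `graph` and calling `merge` on its two endpoints yields a longer formula that can serve as a new element of the formula set, which is one plausible reading of "combining formulas that share identical variables".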

Paper Structure (30 sections, 6 equations, 8 figures, 6 tables, 3 algorithms)

Figures (8)

  • Figure 1: Example of generating FinQA data from formulas. The large language model performs three main tasks in this process. First, it combines the dependent variables of the formula with the specified time range to generate a table. Then, it extracts the values of variables from the table based on the time points relevant to the questions. Finally, it generates text related to the table. We use the independent variables of the formula and random time points to generate questions through templates. The final answers are computed from the formula and the values extracted from the table. Note that in this example the supporting facts for the questions come from the table. When the supporting facts come from text instead, we first use the large language model and a randomly chosen time range to generate the text, and then generate tables unrelated to the formula variables to avoid contradictory information.
  • Figure 2: Overview of FinLLMs with three steps. (1) Graph Construction: We collect standard financial formulas, format them into DSL programs and construct a graph based on the variables involved. (2) Graph Extension: We augment the set of formulas by combining those containing the same variables as new elements. Specifically, we explore and merge those formulas by traversing the constructed graph. (3) Example Generation: We use GPT-3.5 to generate financial question-answering data containing tables and long texts according to the formula set.
  • Figure 3: This graph is composed of four formulas: ebit = total profit + interest expense, interest coverage ratio = ebit / interest expense, net profit = total profit - income tax expense, total profit = operating profit + non-operating income - non-operating expense. The nodes in the dotted box in the figure represent all the nodes whose independent variables contain the variables in the box.
  • Figure 4: We introduce the time dimension into the graph and, for each variable, add several formulas that represent the change in its value over time. The new formula in the figure computes the change in ebit between two time slices.
  • Figure 5: The traversal process. The example in the figure does not involve the time dimension. When traversing the graph from node 1 to node 2, we can generate node 3 by combining these two nodes.
  • ...and 3 more figures
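
The temporal extension in Figure 4 and the answer computation in Figure 1 can be illustrated with a small sketch. Everything here is hypothetical: the table values are made up, the question template is invented for this example, and in the paper the table and accompanying text are generated by GPT-3.5 rather than written by hand. Only the ebit formula and the idea of a change between two time slices come from the captions above.

```python
# Hypothetical table: variable -> {year: value}. The numbers are invented.
TABLE = {
    "total_profit": {2020: 120.0, 2021: 150.0},
    "interest_expense": {2020: 30.0, 2021: 25.0},
}


def ebit(table, year):
    """ebit = total profit + interest expense (formula from Figure 3)."""
    return table["total_profit"][year] + table["interest_expense"][year]


def change(formula, table, start, end):
    """Temporal formula: change of a variable between two time slices."""
    return formula(table, end) - formula(table, start)


def make_question(variable, start, end):
    """Fill an assumed question template with a variable and two time points."""
    return f"What is the change in {variable} from {start} to {end}?"


question = make_question("ebit", 2020, 2021)
answer = change(ebit, TABLE, 2020, 2021)
print(question)
print(answer)
```

Because the answer is computed from the formula and the table values rather than produced by the language model, it stays consistent with the supporting facts by construction.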