Financial Knowledge Large Language Model

Cehao Yang, Chengjin Xu, Yiyan Qi

TL;DR

This work tackles the reliability and practicality gap of large language models in finance by introducing three integrated components. IDEA-FinBench provides a bilingual, exam-style benchmark (CPA/CFA) to rigorously evaluate financial knowledge in LLMs; IDEA-FinKER offers soft (retrieval-based few-shot) and hard (instruction-based fine-tuning) paradigms to rapidly adapt general LLMs to finance without costly pre-training; IDEA-FinQA builds a real-time, externally augmented QA system with dedicated data collection, search, and four LLM-driven agents to ensure factuality with sources. Empirical results show GPT-4 dominates many categories on FinBench, while FinKER typically yields stable gains for weaker models and combined injections offer the best improvements; FinQA reaches superior factual QA performance on FinFact. Collectively, the framework provides a scalable path to trustworthy, domain-specific financial AI capable of reasoning, numerical analysis, and fact-checking with auditable sources, enabling practical deployment in finance.

Abstract

Artificial intelligence is making significant strides in the finance industry, revolutionizing how data is processed and interpreted. Among these technologies, large language models (LLMs) have demonstrated substantial potential to transform financial services by automating complex tasks, enhancing customer service, and providing detailed financial analysis. Firstly, we introduce IDEA-FinBench, an evaluation benchmark specifically tailored for assessing financial knowledge in LLMs. This benchmark uses questions from two globally respected and authoritative financial professional exams, aiming to comprehensively evaluate the capability of LLMs to directly answer exam questions pertinent to the finance sector. Secondly, we propose IDEA-FinKER, a Financial Knowledge Enhancement framework designed to facilitate the rapid adaptation of general LLMs to the financial domain. It introduces a retrieval-based few-shot learning method for real-time context-level knowledge injection, and a set of high-quality financial knowledge instructions for fine-tuning any general LLM. Finally, we present IDEA-FinQA, a financial question-answering system powered by LLMs. This system is structured around a scheme of real-time knowledge injection and factual enhancement using external knowledge. IDEA-FinQA comprises three main modules: the data collector, the data querying module, and LLM-based agents tasked with specific functions.

Paper Structure (86 sections, 6 equations, 19 figures, 6 tables, 1 algorithm)

Figures (19)

  • Figure 1: Architecture of the Transformer model (Vaswani et al., 2017). The encoder (left) consists of multiple stacked encoding layers, and the decoder (right) consists of multiple stacked decoding layers.
  • Figure 2: Architecture of the BERT model (Devlin et al., 2018). It uses unsupervised learning during the pre-training stage (left) and fine-tuning (right) to adapt to downstream tasks.
  • Figure 3: Architecture of the GPT model (Radford et al., 2018). After pre-training with a next-token prediction objective on a large unlabeled text corpus, it can be adapted to downstream tasks.
  • Figure 4: The alignment procedure of InstructGPT (Ouyang et al., 2022). The first step, on the left, is supervised fine-tuning. The second step, in the middle, is reward model training. The third step, on the right, is reinforcement learning.
  • Figure 5: Base pre-trained language models (left) and financial-domain LLMs (right).
  • ...and 14 more figures