Financial Knowledge Large Language Model
Cehao Yang, Chengjin Xu, Yiyan Qi
TL;DR
This work tackles the reliability and practicality gap of large language models in finance by introducing three integrated components. IDEA-FinBench provides a bilingual, exam-style benchmark (CPA/CFA) to rigorously evaluate financial knowledge in LLMs; IDEA-FinKER offers soft (retrieval-based few-shot) and hard (instruction-based fine-tuning) paradigms to rapidly adapt general LLMs to finance without costly pre-training; IDEA-FinQA builds a real-time, externally augmented QA system with dedicated data collection, search, and four LLM-driven agents to ensure factuality with sources. Empirical results show GPT-4 dominates many categories on FinBench, while FinKER typically yields stable gains for weaker models and combined injections offer the best improvements; FinQA reaches superior factual QA performance on FinFact. Collectively, the framework provides a scalable path to trustworthy, domain-specific financial AI capable of reasoning, numerical analysis, and fact-checking with auditable sources, enabling practical deployment in finance.
Abstract
Artificial intelligence is making significant strides in the finance industry, changing how data is processed and interpreted. Large language models (LLMs) in particular have demonstrated substantial potential to transform financial services by automating complex tasks, enhancing customer service, and providing detailed financial analysis. First, we introduce IDEA-FinBench, an evaluation benchmark tailored to assessing the financial knowledge of LLMs. The benchmark draws its questions from two globally respected and authoritative financial professional exams, aiming to comprehensively evaluate the ability of LLMs to directly answer exam questions pertinent to the finance sector. Second, we propose IDEA-FinKER, a Financial Knowledge Enhancement framework designed to facilitate the rapid adaptation of general LLMs to the financial domain. It introduces a retrieval-based few-shot learning method for real-time context-level knowledge injection and a set of high-quality financial knowledge instructions for fine-tuning any general LLM. Finally, we present IDEA-FinQA, a financial question-answering system powered by LLMs. The system is built around a scheme of real-time knowledge injection and factual enhancement using external knowledge. IDEA-FinQA comprises three main modules: the data collector, the data querying module, and LLM-based agents tasked with specific functions.
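To make the idea of retrieval-based few-shot knowledge injection concrete, the following is a minimal sketch and not the IDEA-FinKER implementation. It assumes a small pool of solved question-answer pairs, ranks them against the incoming question with a bag-of-words cosine similarity (a stand-in for whatever retriever the framework actually uses), and prepends the top matches to the prompt. The pool entries and the names `tokenize`, `cosine`, `retrieve`, `build_prompt`, and `POOL` are illustrative assumptions.

```python
import math
import re
from collections import Counter

# Hypothetical exemplar pool; entries are illustrative, not taken from any benchmark.
POOL = [
    {"question": "What does a bond's duration measure?",
     "answer": "The sensitivity of its price to interest rate changes."},
    {"question": "What does the current ratio compare?",
     "answer": "Current assets to current liabilities."},
    {"question": "What is the yield to maturity of a bond?",
     "answer": "The discount rate that equates the bond's price with "
               "the present value of its cash flows."},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, pool, k=2):
    """Return the k pool entries whose questions are most similar to the query."""
    q_vec = Counter(tokenize(query))
    ranked = sorted(
        pool,
        key=lambda ex: cosine(q_vec, Counter(tokenize(ex["question"]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, pool, k=2):
    """Prepend the retrieved exemplars as few-shot demonstrations."""
    shots = [
        f"Question: {ex['question']}\nAnswer: {ex['answer']}"
        for ex in retrieve(query, pool, k)
    ]
    shots.append(f"Question: {query}\nAnswer:")
    return "\n\n".join(shots)


if __name__ == "__main__":
    print(build_prompt("How does bond duration relate to interest rate risk?", POOL))
```

The resulting string would then be passed to a general LLM, so the model receives domain knowledge at the context level without any parameter update. In practice a dense or domain-tuned retriever would likely replace the word-overlap score used here.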
