FinBloom: Knowledge Grounding Large Language Model with Real-time Financial Data
Ankur Sinha, Chaitanya Agarwal, Pekka Malo
TL;DR
Large language models struggle with real-time financial information. The paper addresses this with a knowledge-grounding framework centered on a Financial Agent. FinBloom 7B, a domain-specific 7-billion-parameter LLM trained on 14 million financial news articles and 12 million SEC filings, is fine-tuned to generate contextual data requests. These requests drive a separate Data Module, which retrieves the data a larger LLM needs to produce fast, context-rich responses. A Financial Context Dataset of more than 50,000 queries, built with template-driven query generation, underpins the Financial Agent, which demonstrates strong performance on the FinBen benchmark and reduces latency for high-velocity finance tasks. The approach is aimed at real-time decision-making in finance, and the paper points toward multi-agent extensions as future work.
Abstract
Large language models (LLMs) excel at generating human-like responses but often struggle with interactive tasks that require access to real-time information. This limitation poses challenges in finance, where models must access up-to-date information, such as recent news or price movements, to support decision-making. To address this, we introduce the Financial Agent, a knowledge-grounding approach that lets LLMs handle financial queries using real-time text and tabular data. Our contributions are threefold. First, we develop a Financial Context Dataset of over 50,000 financial queries paired with the required context. Second, we train FinBloom 7B, a custom 7-billion-parameter LLM, on 14 million financial news articles from Reuters and Deutsche Presse-Agentur, alongside 12 million Securities and Exchange Commission (SEC) filings. Third, we fine-tune FinBloom 7B on the Financial Context Dataset to serve as a Financial Agent. This agent generates the relevant financial context, enabling efficient real-time data retrieval to answer user queries. By reducing latency and eliminating the need for users to manually provide accurate data, our approach significantly enhances the capability of LLMs to handle dynamic financial tasks. It streamlines real-time financial decision-making, algorithmic trading, and related tasks, and is valuable in contexts with high-velocity data flows.
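The flow described above (query, then context request, then data retrieval, then a grounded prompt) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `DataRequest` schema, the keyword-based `financial_agent` stand-in for the fine-tuned FinBloom 7B model, the in-memory `DataModule`, and the prompt layout are all assumptions made for illustration, and the stored value is a placeholder rather than real financial data.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class DataRequest:
    """Structured context request (hypothetical schema, not taken from the paper)."""
    ticker: str
    metric: str
    period: str


def financial_agent(query: str) -> List[DataRequest]:
    """Stand-in for the fine-tuned agent: maps a user query to data requests.

    A real agent would be an LLM; a keyword lookup is used here only so the
    sketch runs without any model.
    """
    known_tickers = {"apple": "AAPL", "microsoft": "MSFT"}   # assumed mapping
    known_metrics = {"revenue": "revenue", "price": "close_price"}
    text = query.lower()
    requests = []
    for name, ticker in known_tickers.items():
        if name not in text:
            continue
        for word, metric in known_metrics.items():
            if word in text:
                requests.append(DataRequest(ticker, metric, "latest"))
    return requests


class DataModule:
    """Toy data module backed by a dictionary instead of a live data feed."""

    def __init__(self, store: Dict[Tuple[str, str, str], float]) -> None:
        self._store = store

    def fetch(self, requests: List[DataRequest]) -> Dict[DataRequest, Optional[float]]:
        # Missing entries come back as None rather than raising.
        return {r: self._store.get((r.ticker, r.metric, r.period)) for r in requests}


def build_prompt(query: str, context: Dict[DataRequest, Optional[float]]) -> str:
    """Assemble the grounded prompt that would be sent to the larger LLM."""
    lines = [f"{r.ticker} {r.metric} ({r.period}): {v}" for r, v in context.items()]
    return "Context:\n" + "\n".join(lines) + f"\n\nQuestion: {query}"


if __name__ == "__main__":
    query = "What is the latest revenue of Apple?"
    module = DataModule({("AAPL", "revenue", "latest"): 123.0})  # placeholder value
    print(build_prompt(query, module.fetch(financial_agent(query))))
```

Keeping the request generator separate from retrieval mirrors the division of labor in the abstract: the smaller model only decides what data is needed, so the larger model receives a prompt that already contains the retrieved context.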
