The LLM Pro Finance Suite: Multilingual Large Language Models for Financial Applications

Gaëtan Caillaut; Raheel Qader; Jingshu Liu; Mariam Nakhlé; Arezki Sadoune; Massinissa Ahmim; Jean-Gabriel Barthelemy

TL;DR

This work addresses the limits of generalist LLMs in finance by building the LLM Pro Finance Suite, a set of five instruction-tuned models ranging from 8B to 70B parameters, trained on a multilingual corpus in which more than 50% of the data is finance-related. The authors start from generalist instruction-tuned models and adapt them with domain-specific data through continued pre-training (CPT) and supervised fine-tuning (SFT) pipelines, adding translation data to strengthen multilingual ability; the aim is to preserve broad task performance while improving financial reasoning, translation, and advisory capabilities. They introduce a finance-focused benchmark suite, evaluate on both general and finance tasks, and publicly release two 8B models to enable further research. The results show consistent gains on financial tasks and financial translation without sacrificing general language abilities; the paper also examines RAG and toxicity and discusses the limitations of its evaluation. The work thus offers a practical path toward multilingual financial NLP built on publicly released models, with potential for retrieval-augmented and agentic finance workflows.

Abstract

The financial industry's growing demand for advanced natural language processing (NLP) capabilities has highlighted the limitations of generalist large language models (LLMs) in handling domain-specific financial tasks. To address this gap, we introduce the LLM Pro Finance Suite, a collection of five instruction-tuned LLMs (ranging from 8B to 70B parameters) specifically designed for financial applications. Our approach focuses on enhancing generalist instruction-tuned models, leveraging their existing strengths in instruction following, reasoning, and toxicity control, while fine-tuning them on a curated, high-quality financial corpus comprising over 50% finance-related data in English, French, and German. We evaluate the LLM Pro Finance Suite on a comprehensive financial benchmark suite, demonstrating consistent improvement over state-of-the-art baselines in finance-oriented tasks and financial translation. Notably, our models maintain the strong general-domain capabilities of their base models, ensuring reliable performance across non-specialized tasks. This dual proficiency (enhanced financial expertise without compromising general abilities) makes the LLM Pro Finance Suite an ideal drop-in replacement for existing LLMs in financial workflows, offering improved domain-specific performance while preserving overall versatility. We publicly release two 8B-parameter models to foster future research and development in financial NLP applications: https://huggingface.co/collections/DragonLLM/llm-open-finance.
