LLMs Meet Finance: Fine-Tuning Foundation Models for the Open FinLLM Leaderboard

Varun Rao, Youran Sun, Mahendra Kumar, Tejas Mutneja, Agastya Mukherjee, Haizhao Yang

TL;DR

This paper studies the application of LLMs to finance using the Open FinLLM Leaderboard as a benchmark. It presents a fine-tuning pipeline that combines supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement learning (RL) with synthetic data to improve performance on 36 financial tasks. SFT is followed by DPO to curb repetition; RL with data synthesis is used when task data are scarce, and the pipeline can be iterated for further refinement. The authors report substantial gains across tasks and measure a financial data scaling law with a critical exponent of $\alpha \approx 0.28$, which they take as a sign of cross-domain universality. They also discuss practical constraints on preprocessing and postprocessing, and outline future directions: larger models, more task-specific prompts, and additional RL iterations for broader financial deployment.

Abstract

This paper investigates the application of large language models (LLMs) to financial tasks. We fine-tuned foundation models using the Open FinLLM Leaderboard as a benchmark. Building on Qwen2.5 and DeepSeek-R1, we employed supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement learning (RL) to enhance their financial capabilities. The fine-tuned models showed substantial performance gains across a wide range of financial tasks. We also measured the data scaling law in the financial domain. Our work demonstrates the potential of LLMs in financial applications.
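As a rough illustration of how a data scaling exponent can be estimated, the sketch below fits a power law by least squares in log-log space. The functional form (error proportional to dataset size raised to $-\alpha$), the function name `fit_power_law`, and the data points are all assumptions made for this example. The data are synthetic and generated with an exponent of 0.28 only to check the fit; they are not results from the paper, and this is not necessarily the authors' fitting procedure.

```python
import math


def fit_power_law(sizes, errors):
    """Fit errors ~ c * sizes**(-alpha) via linear regression on logs.

    Returns (alpha, c). Assumes all sizes and errors are positive.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope and intercept of log(error) vs log(size).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # A power law is a straight line in log-log space with slope -alpha.
    return -slope, math.exp(intercept)


# Synthetic, noise-free data for illustration only (hypothetical values).
sizes = [1_000, 3_000, 10_000, 30_000, 100_000]
errors = [2.0 * n ** (-0.28) for n in sizes]

alpha_hat, c_hat = fit_power_law(sizes, errors)
print(f"alpha = {alpha_hat:.3f}, c = {c_hat:.3f}")
```

With real measurements the points would be noisy, so it is worth plotting them on log-log axes to check that they fall on a line before trusting the fitted exponent.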

Paper Structure

This paper contains 18 sections, 3 equations, 2 figures, and 3 tables.

Figures (2)

  • Figure 1: Training flowchart showing the progression from base model to final model through SFT, DPO, and RL stages.
  • Figure 2: The data scaling law on financial tasks.