FANAL -- Financial Activity News Alerting Language Modeling Framework

Urjitkumar Patel, Fang-Chun Yeh, Chinmay Gondhalekar, Hari Nalluri

TL;DR

FANAL, a finance-focused, BERT-based framework, tackles real-time financial news categorization across 12 categories using a silver-labeling pipeline (XGBoost) and a novel ORPO-tuned variant (ORBERT) to address class imbalance. It combines cross-entropy and ORPO loss functions with parameter-efficient fine-tuning (LoRA) and an Entity Relevance Module to balance performance and efficiency. Compared against GPT-4o, Llama-3.1 8B, and Phi-3, FANAL demonstrates competitive accuracy with substantially lower costs and faster inference, particularly in resource-constrained environments. The work contributes a practical blueprint for finance NLP that emphasizes data efficiency, calibration, and real-time applicability, with potential for broader adoption in financial intelligence and risk analysis.

Abstract

In the rapidly evolving financial sector, the accurate and timely interpretation of market news is essential for stakeholders needing to navigate unpredictable events. This paper introduces FANAL (Financial Activity News Alerting Language Modeling Framework), a specialized BERT-based framework engineered for real-time financial event detection and analysis, categorizing news into twelve distinct financial categories. FANAL leverages silver-labeled data processed through XGBoost and employs advanced fine-tuning techniques, alongside ORBERT (Odds Ratio BERT), a novel variant of BERT fine-tuned with ORPO (Odds Ratio Preference Optimization) for superior class-wise probability calibration and alignment with financial event relevance. We evaluate FANAL's performance against leading large language models, including GPT-4o, Llama-3.1 8B, and Phi-3, demonstrating its superior accuracy and cost efficiency. This framework sets a new standard for financial intelligence and responsiveness, significantly outstripping existing models in both performance and affordability.

Paper Structure

This paper contains 41 sections, 20 equations, 4 figures, 7 tables.

Figures (4)

  • Figure 1: ORBERT Architecture
  • Figure 2: Illustration of training and validation loss over 10 epochs
  • Figure 4: Recall Improves with Prompt Engineering
  • Figure 5: Fine-Tuned ORBERT vs Fine-Tuned Base BERT