FANAL -- Financial Activity News Alerting Language Modeling Framework
Urjitkumar Patel, Fang-Chun Yeh, Chinmay Gondhalekar, Hari Nalluri
TL;DR
FANAL is a finance-focused, BERT-based framework that categorizes financial news in real time into 12 categories. It uses a silver-labeling pipeline built on XGBoost and a novel ORPO-tuned BERT variant (ORBERT) to address class imbalance. Training combines cross-entropy and ORPO losses with parameter-efficient fine-tuning (LoRA), and an Entity Relevance Module helps balance performance and efficiency. Compared against GPT-4o, Llama-3.1, and Phi-3, FANAL demonstrates competitive accuracy at substantially lower cost and with faster inference, particularly in resource-constrained environments. The work offers a practical blueprint for finance NLP that emphasizes data efficiency, calibration, and real-time applicability, with potential for broader adoption in financial intelligence and risk analysis.
Abstract
In the rapidly evolving financial sector, accurate and timely interpretation of market news is essential for stakeholders who must navigate unpredictable events. This paper introduces FANAL (Financial Activity News Alerting Language Modeling Framework), a specialized BERT-based framework for real-time financial event detection and analysis that categorizes news into twelve distinct financial categories. FANAL is trained on silver-labeled data produced with XGBoost and uses advanced fine-tuning techniques. It also introduces ORBERT (Odds Ratio BERT), a novel BERT variant fine-tuned with ORPO (Odds Ratio Preference Optimization) for better class-wise probability calibration and alignment with financial event relevance. We evaluate FANAL against leading large language models, including GPT-4o, Llama-3.1 8B, and Phi-3, demonstrating its superior accuracy and cost efficiency. The framework sets a new standard for financial intelligence and responsiveness, significantly outperforming existing models in both performance and affordability.
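The summary names a combination of cross-entropy and ORPO losses but does not give the formulation. As a rough illustration only, the sketch below adds the odds-ratio term from the original ORPO objective, -log sigmoid(log(odds(chosen) / odds(rejected))) with odds(p) = p / (1 - p), to a standard cross-entropy loss for a single example. The function names, the weight `lam`, and the choice to treat the true class as "chosen" and one competing class as "rejected" are assumptions made for this example, not details taken from FANAL.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def log_odds(p, eps=1e-12):
    """log(p / (1 - p)), with p clamped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p) - math.log(1.0 - p)

def softplus(x):
    """Stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def ce_plus_odds_ratio_loss(logits, chosen, rejected, lam=0.1):
    """Cross-entropy on the chosen class plus a weighted odds-ratio penalty.

    Assumption (hypothetical mapping): `chosen` is the index of the true
    class and `rejected` is the index of one competing class.
    """
    probs = softmax(logits)
    ce = -math.log(probs[chosen])
    log_or = log_odds(probs[chosen]) - log_odds(probs[rejected])
    # -log(sigmoid(x)) equals softplus(-x)
    or_term = softplus(-log_or)
    return ce + lam * or_term

# Example: three classes, class 0 is correct, class 1 is the rejected one.
good = ce_plus_odds_ratio_loss([2.0, 0.0, 0.0], chosen=0, rejected=1)
bad = ce_plus_odds_ratio_loss([0.0, 2.0, 0.0], chosen=0, rejected=1)
```

In this sketch, `good` is smaller than `bad` because the odds-ratio term rewards a large gap between the odds of the chosen and rejected classes; setting `lam=0` reduces the loss to plain cross-entropy.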
