FISCAL: Financial Synthetic Claim-document Augmented Learning for Efficient Fact-Checking
Rishab Sharma, Iman Saberi, Elham Alipour, Jie JW Wu, Fatemeh Fard
TL;DR
This work tackles the dual challenge of factual reliability and computational cost in financial AI by introducing FISCAL, a modular synthetic data generator, and MiniCheck-FISCAL, a lightweight 7B fact-checker trained on domain-specific synthetic data (FISCAL-Data). The approach combines a Modular Claim–Document Generator with six augmentation strategies to produce large, diverse training triplets, which are validated through multi-LLM adjudication (Cohen’s kappa = 0.892) to ensure that claims are atomic and verifiable. Fine-tuning uses LoRA in a causal language modeling (CLM) setup, with a simple, interpretable single-token decision and a probabilistic confidence score $C_i$ thresholded at $\tau$, enabling fast, reliable inference. On internal benchmarks and on the external datasets FinDVer and Fin-Fact, MiniCheck-FISCAL matches or surpasses many larger models, illustrating that domain-focused synthetic data plus efficient fine-tuning can deliver state-of-the-art performance at a fraction of the cost. The results point to practical deployment benefits for finance-focused AI, and ablation studies underscore the importance of the paraphrasing and contradiction-insertion modules for robust factual verification.
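The single-token decision with a confidence score $C_i$ compared against $\tau$ can be illustrated with a minimal sketch. It assumes, since this summary does not define $C_i$, that the confidence is a two-way softmax over the logits of the two decision tokens; the function names, the token roles, and the default $\tau = 0.5$ are hypothetical and chosen for illustration only.

```python
import math


def confidence(logit_supported: float, logit_unsupported: float) -> float:
    """Two-way softmax over the logits of the two decision tokens.

    Assumption: C_i is the normalized probability of the "supported" token.
    The paper's exact definition may differ.
    """
    # Subtract the max logit for numerical stability.
    m = max(logit_supported, logit_unsupported)
    e_sup = math.exp(logit_supported - m)
    e_uns = math.exp(logit_unsupported - m)
    return e_sup / (e_sup + e_uns)


def decide(logit_supported: float, logit_unsupported: float, tau: float = 0.5):
    """Return (is_supported, C_i), accepting the claim when C_i >= tau."""
    c = confidence(logit_supported, logit_unsupported)
    return c >= tau, c


if __name__ == "__main__":
    # Hypothetical logits read off the model's next-token distribution.
    supported, c = decide(2.0, 0.0, tau=0.5)
    print(supported, round(c, 3))
```

Raising `tau` in this sketch trades recall for precision: fewer claims are accepted as supported, but those that are carry higher confidence.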
Abstract
Financial applications of large language models (LLMs) require factual reliability and computational efficiency, yet current systems often hallucinate details and depend on prohibitively large models. We propose FISCAL (Financial Synthetic Claim-Document Augmented Learning), a modular framework for generating synthetic data tailored to financial fact-checking. Using FISCAL, we generate a dataset called FISCAL-Data and use it to train MiniCheck-FISCAL, a lightweight verifier for numerical financial claims. MiniCheck-FISCAL outperforms its baseline, surpasses GPT-3.5 Turbo and open-source peers of similar size, and approaches the accuracy of systems roughly 20x larger, such as Mixtral-8x22B and Command R+. On the external datasets FinDVer and Fin-Fact, it rivals GPT-4o and Claude-3.5 while outperforming Gemini-1.5 Flash. These results show that domain-specific synthetic data, combined with efficient fine-tuning, enables compact models to achieve state-of-the-art accuracy, robustness, and scalability for practical financial AI. The dataset and scripts are available in the project repository (link provided in the paper).
