Better with Less: Small Proprietary Models Surpass Large Language Models in Financial Transaction Understanding

Wanying Ding, Savinay Narendra, Xiran Shi, Adwait Ratnaparkhi, Chengrui Yang, Nikoo Sabzevar, Ziyan Yin

TL;DR

The paper compares small proprietary Transformer models with large pretrained LLMs for financial transaction understanding, evaluating Encoder-Only, Decoder-Only, and Encoder-Decoder architectures across pretrained LLMs, fine-tuned LLMs, and small proprietary models developed from scratch. Using extensive datasets and a weighted evaluation, the authors show that domain-tailored proprietary models can achieve comparable or superior accuracy with far fewer parameters and lower latency, enabling real-time, cost-efficient transaction processing. A proprietary Decoder-Only model, trained specifically on transaction data, achieves a 14% increase in transaction coverage and over $13 million in annual cost savings when deployed, outperforming larger LLMs in practical deployment scenarios. The work argues for task- and domain-specific model selection and demonstrates substantial practical impact for real-time financial applications.

Abstract

Analyzing financial transactions is crucial for ensuring regulatory compliance, detecting fraud, and supporting decisions. The complexity of financial transaction data necessitates advanced techniques to extract meaningful insights and ensure accurate analysis. Since Transformer-based models have shown outstanding performance across multiple domains, this paper explores their potential in understanding financial transactions. We conduct extensive experiments to evaluate three types of Transformer models: Encoder-Only, Decoder-Only, and Encoder-Decoder models. For each type, we explore three options: pretrained LLMs, fine-tuned LLMs, and small proprietary models developed from scratch. Our analysis reveals that while LLMs such as LLaMA3-8b, Flan-T5, and SBERT demonstrate impressive capabilities in various natural language processing tasks, they do not significantly outperform small proprietary models in the specific context of financial transaction understanding. The gap is particularly evident in speed and cost efficiency: proprietary models, tailored to the unique requirements of transaction data, exhibit faster processing times and lower operational costs, making them more suitable for real-time applications in the financial sector. Our findings highlight the importance of model selection based on domain-specific needs and underscore the potential advantages of customized proprietary models over general-purpose LLMs in specialized applications. Ultimately, we chose to implement a proprietary decoder-only model to handle the complex transactions that we previously could not manage. This model improves transaction coverage by 14% and saves more than $13 million in annual costs.

Paper Structure

This paper contains 25 sections, 3 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Example for Transaction Understanding
  • Figure 2: Open Sourced LLMs vs. Proprietary Small Transformers on Task Accuracy
  • Figure 3: Models' performance with different tokenizers and vocabulary sizes
  • Figure 4: Models' performance with different model sizes
  • Figure 5: Model Deployment