Instruction-Based Fine-tuning of Open-Source LLMs for Predicting Customer Purchase Behaviors
Halil Ibrahim Ergul, Selim Balcisoy, Burcin Bozkaya
TL;DR
The study investigates predicting the next merchant category from financial transactions using instruction-tuned open-source LLMs, comparing a probabilistic baseline, CNN, LSTM, and fine-tuned LLMs trained via LoRA on natural-language representations of transactions. Bank A provides the training data, while Bank B provides unseen test data to assess generalization. Mistral-7B-Instruct-v0.2 delivers the strongest performance (weighted F1 up to 0.66 on last-9 inputs) and robust class-specific accuracy, including on minority categories such as Clothing and Gas Stations. The results highlight the value of semantic understanding in instruction-tuned LLMs and their ability to handle imbalanced transaction classes better than traditional sequential models, with gains across multiple sequence lengths and categories. This work suggests practical applications in targeted marketing and personalized financial services, enabled by open-source LLMs fine-tuned with a PEFT approach.
Abstract
This study evaluates the performance of various predictive models, including a probabilistic baseline, CNN, LSTM, and fine-tuned LLMs, in forecasting merchant categories from financial transaction data. Using datasets from Bank A for training and Bank B for testing, it demonstrates the superior predictive capabilities of the fine-tuned Mistral Instruct model, which was trained on customer data converted into natural-language format. The methodology involves instruction fine-tuning Mistral via LoRA (Low-Rank Adaptation of Large Language Models) to adapt its vast pre-trained knowledge to the specific domain of financial transactions. The Mistral model significantly outperforms traditional sequential models, achieving higher F1 scores in the three key merchant categories of bank transaction data (grocery, clothing, and gas stations), which are crucial for targeted marketing campaigns. This performance is attributed to the model's enhanced semantic understanding and adaptability, which enable it to better manage minority classes and predict transaction categories with greater accuracy. These findings highlight the potential of LLMs in predicting human behavior.
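To make the idea of converting customer data into natural-language format concrete, the following is a minimal sketch of how a transaction history might be rendered as an instruction-style training example. It is an illustration only, not the authors' actual template: the prompt wording, the section markers, the function names `build_prompt` and `build_example`, and the three-item category list are all assumptions (the abstract names only grocery, clothing, and gas stations as key categories).

```python
from typing import Dict, List

# Assumed label set for illustration; the real label set is not given in the abstract.
CATEGORIES = ["Grocery", "Clothing", "Gas Stations"]


def build_prompt(history: List[str], categories: List[str] = CATEGORIES) -> str:
    """Render a customer's recent merchant categories as an instruction prompt.

    The template below is hypothetical; it only shows the general shape of
    an instruction / input / response layout.
    """
    if not history:
        raise ValueError("history must contain at least one transaction")
    # Number the transactions from oldest (1) to most recent (len(history)).
    numbered = "\n".join(f"{i}. {cat}" for i, cat in enumerate(history, start=1))
    options = ", ".join(categories)
    return (
        "### Instruction:\n"
        f"Given the customer's last {len(history)} transactions, "
        "predict the merchant category of the next one. "
        f"Answer with one of: {options}.\n\n"
        "### Input:\n"
        f"{numbered}\n\n"
        "### Response:\n"
    )


def build_example(history: List[str], target: str) -> Dict[str, str]:
    """Pair a prompt with its ground-truth next category for supervised fine-tuning."""
    return {"prompt": build_prompt(history), "completion": target}


if __name__ == "__main__":
    example = build_example(["Grocery", "Grocery", "Gas Stations"], "Clothing")
    print(example["prompt"] + example["completion"])
```

Examples of this shape could then be passed to a parameter-efficient fine-tuning setup such as LoRA, where only small low-rank adapter matrices are trained while the base model's weights stay frozen.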
