Towards a Foundation Purchasing Model: Pretrained Generative Autoregression on Transaction Sequences

Piotr Skalski; David Sutton; Stuart Burrell; Iker Perez; Jason Wong

TL;DR

This work addresses the challenge of learning transferable representations from multivariate financial transaction sequences. It proposes NPPR, a self-supervised pretraining framework that combines next-event prediction with past reconstruction in a GRU-based encoder. The resulting contextualised embeddings outperform hand-engineered features and prior SSL approaches on multiple downstream tasks and, when the model is pretrained on a large, diverse corpus, generalise to substantially out-of-domain fraud-detection data. The authors demonstrate gains on public datasets and show transfer at scale to real-world fraud datasets; embedding-space visualisations reveal semantic clustering by merchant category. The work highlights the potential of foundation-model-style pretraining for financial sequences, enabling robust, label-efficient downstream analytics, while raising considerations around privacy, bias, and few-shot learning for future research.

Abstract

Machine learning models underpin many modern financial systems for use cases such as fraud detection and churn prediction. Most are based on supervised learning with hand-engineered features, which relies heavily on the availability of labelled data. Large self-supervised generative models have shown tremendous success in natural language processing and computer vision, yet so far they have not been adapted to multivariate time series of financial transactions. In this paper, we present a generative pretraining method that can be used to obtain contextualised embeddings of financial transactions. Benchmarks on public datasets demonstrate that it outperforms state-of-the-art self-supervised methods on a range of downstream tasks. We additionally perform large-scale pretraining of an embedding model using a corpus of data from 180 issuing banks containing 5.1 billion transactions and apply it to the card fraud detection problem on hold-out datasets. The embedding model significantly improves value detection rate at high precision thresholds and transfers well to out-of-domain distributions.

Paper Structure (23 sections, 3 equations, 3 figures, 5 tables)

Introduction
Related Work
Generative modelling on transaction sequences
Proposed method
Model architecture
Experiments on public datasets
Datasets
Hyperparameters
Baselines
Results
Comparison with baseline methods
Importance of constituent tasks
Effect of averaging transaction embeddings
Application to Fraud Detection at scale
Pretrained embedding model
...and 8 more sections

Figures (3)

Figure 1: The NPPR generative modelling framework for pretraining a recurrent encoder using a combination of next event prediction and past reconstruction tasks.
Figure 2: Evaluations of fraud detection models trained on datasets from three different issuers. Three models are shown: baseline hand-engineered features from Jha et al., baseline features with NPPR embeddings trained on the pretraining corpus, and baseline features with NPPR embeddings finetuned on downstream datasets. Error bars were computed from multiple independent training runs.
Figure 3: t-SNE projection of an MCC (merchant category code) embedding space. Each MCC embedding was obtained by averaging the transaction embeddings corresponding to that MCC.
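
The averaging step described in the Figure 3 caption can be illustrated with a minimal sketch. It assumes each transaction arrives as an (MCC, embedding) pair. The function name `mean_embedding_per_mcc`, the toy codes, and the two-dimensional vectors are hypothetical and not taken from the paper. The subsequent t-SNE projection is not shown.

```python
from typing import Dict, List, Sequence, Tuple


def mean_embedding_per_mcc(
    transactions: Sequence[Tuple[str, Sequence[float]]],
) -> Dict[str, List[float]]:
    """Average the transaction embeddings that share an MCC (hypothetical helper)."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for mcc, emb in transactions:
        if mcc not in sums:
            # First transaction seen for this MCC: start a zero accumulator.
            sums[mcc] = [0.0] * len(emb)
            counts[mcc] = 0
        elif len(emb) != len(sums[mcc]):
            raise ValueError("embeddings for one MCC must share a dimension")
        for i, value in enumerate(emb):
            sums[mcc][i] += value
        counts[mcc] += 1
    # Divide each accumulated sum by the number of transactions for that MCC.
    return {mcc: [s / counts[mcc] for s in total] for mcc, total in sums.items()}


# Toy data: placeholder MCCs and made-up embedding values, for illustration only.
toy = [
    ("5411", [1.0, 0.0]),
    ("5411", [3.0, 2.0]),
    ("5812", [0.0, 4.0]),
]
mcc_vectors = mean_embedding_per_mcc(toy)
```

In practice the resulting per-MCC vectors would be passed to a dimensionality-reduction routine to produce a two-dimensional plot; that choice, and the real embedding dimension, are left open here.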
