FraudTransformer: Time-Aware GPT for Transaction Fraud Detection
Gholamali Aminian, Andrew Elliott, Tiger Li, Timothy Cheuk Hin Wong, Victor Claude Dehon, Lukasz Szpruch, Carsten Maple, Christopher Read, Martin Brown, Gesine Reinert, Mo Mamouei
TL;DR
FraudTransformer introduces a GPT-like sequence model augmented with a time encoder and a learned positional encoder to capture temporal structure in irregular transaction streams. The time encoder supports absolute or relative time inputs, each with either a sinusoidal or a rotary representation. On a large HSBC dataset, the approach achieves higher AUROC and PRAUC than classical baselines and time-free GPT variants. The strongest configuration uses event-level relative sinusoidal time with a learned positional encoder and LayerNorm, illustrating that fine-grained temporal signals beyond raw features significantly improve fraud detection. The work suggests practical gains for real-time monitoring and points to future directions such as pretraining, multi-class fraud subtype handling, and longer context lengths.
Abstract
Detecting payment fraud in real-world banking streams requires models that can exploit both the order of events and the irregular time gaps between them. We introduce FraudTransformer, a sequence model that augments a vanilla GPT-style architecture with (i) a dedicated time encoder that embeds either absolute timestamps or inter-event time gaps, and (ii) a learned positional encoder that preserves relative order. Experiments on a large industrial dataset (tens of millions of transactions and auxiliary events) show that FraudTransformer surpasses strong classical baselines (Logistic Regression, XGBoost and LightGBM) as well as transformer ablations that omit either the time or positional component. On the held-out test set it delivers the highest AUROC and PRAUC.
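To make the idea of a relative sinusoidal time representation concrete, the following is a minimal sketch and not the authors' implementation. It assumes timestamps in seconds, computes the gap between consecutive events, and maps each gap to sine and cosine features in the style of the standard Transformer positional encoding. The function names `inter_event_gaps` and `sinusoidal_time_encoding`, the dimension `dim=8`, and the base `max_period=10_000.0` are illustrative assumptions; the summary above does not specify them.

```python
import math


def inter_event_gaps(timestamps):
    """Relative time: gap between each event and its predecessor.

    The first event has no predecessor, so its gap is set to 0.0
    (an assumption made for this sketch).
    """
    return [0.0] + [float(b - a) for a, b in zip(timestamps, timestamps[1:])]


def sinusoidal_time_encoding(value, dim=8, max_period=10_000.0):
    """Encode a scalar time value as interleaved sin/cos features.

    Uses geometrically spaced frequencies, as in the standard
    Transformer positional encoding; dim and max_period are
    hypothetical defaults, not values from the paper.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    features = []
    for i in range(dim // 2):
        freq = max_period ** (-2.0 * i / dim)
        features.append(math.sin(value * freq))
        features.append(math.cos(value * freq))
    return features


# Hypothetical event times in seconds for one customer's stream.
timestamps = [0, 30, 45, 3645]
gaps = inter_event_gaps(timestamps)
encodings = [sinusoidal_time_encoding(g) for g in gaps]
```

In a model of this kind, each encoded gap would typically be combined with the event's feature embedding and its learned positional embedding before the transformer layers; how FraudTransformer combines them is not stated in this summary.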
