FRAUD-RLA: A new reinforcement learning adversarial attack against credit card fraud detection
Daniele Lunghi, Yannick Molinghen, Alkis Simitsis, Tom Lenaerts, Gianluca Bontempi
TL;DR
This paper addresses the underexplored risk of adversarial attacks on credit card fraud detection by introducing FRAUD-RLA, a reinforcement-learning attack that operates under limited attacker knowledge and leverages the exploration-exploitation tradeoff to craft undetected fraudulent transactions. The problem is formulated as a single-step partially observable Markov decision process over transactions $x = \langle x_c, x_k, x_u \rangle$, with reward $R(s,a,s') = 1$ when the transaction is classified as genuine; Proximal Policy Optimization then jointly learns the means and covariance of a multivariate action distribution. Across three heterogeneous datasets and two detectors (Random Forest and Neural Network), the authors show that FRAUD-RLA can quickly achieve high success rates and generally outperforms a Mimicry baseline, especially against neural detectors, while its effectiveness depends on how many features the attacker can control and how many remain unknown. The work emphasizes the need for defenses against RL-based fraud attacks, discusses ethical considerations, and advocates open science to enable robust red-teaming and reproducibility in fraud-detection robustness research.
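To make the single-step formulation more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that $x_c$ denotes the features the attacker controls, $x_k$ the features the attacker observes but cannot change, and $x_u$ the features hidden from the attacker; it also assumes a reward of 0 when the transaction is flagged, which the summary above does not state. The names `SingleStepFraudEnv`, `sample_action`, and `toy_detector` are hypothetical, and the detector is a stand-in threshold rule rather than a Random Forest or Neural Network. The policy update itself (PPO in the paper) is omitted; only the environment and the sampling of an action from a multivariate Gaussian are shown.

```python
import random


class SingleStepFraudEnv:
    """Toy single-step environment: each episode is one transaction."""

    def __init__(self, detector, n_known, n_unknown, rng):
        self.detector = detector    # callable: full feature list -> True if flagged
        self.n_known = n_known
        self.n_unknown = n_unknown
        self.rng = rng

    def reset(self):
        # x_k is observed by the attacker; x_u exists but stays hidden (assumption).
        self.x_k = [self.rng.gauss(0.0, 1.0) for _ in range(self.n_known)]
        self.x_u = [self.rng.gauss(0.0, 1.0) for _ in range(self.n_unknown)]
        return list(self.x_k)

    def step(self, x_c):
        # The detector sees the full transaction x = <x_c, x_k, x_u>.
        x = list(x_c) + self.x_k + self.x_u
        # Reward 1 if classified as genuine; 0 otherwise is an assumption.
        return 0.0 if self.detector(x) else 1.0


def sample_action(mean, chol, rng):
    """Draw x_c ~ N(mean, chol @ chol^T), with chol lower-triangular."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [
        m + sum(chol[i][j] * z[j] for j in range(i + 1))
        for i, m in enumerate(mean)
    ]


def toy_detector(x):
    # Hypothetical rule: flag when the two controllable features sum above 1.
    return x[0] + x[1] > 1.0


if __name__ == "__main__":
    rng = random.Random(0)
    env = SingleStepFraudEnv(toy_detector, n_known=3, n_unknown=2, rng=rng)
    observation = env.reset()
    action = sample_action([0.0, 0.0], [[1.0, 0.0], [0.5, 1.0]], rng)
    print(observation, action, env.step(action))
```

Parameterizing the covariance through a lower-triangular factor is one common way to keep it positive semi-definite while its entries are learned; whether FRAUD-RLA uses this parameterization is not stated here.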
Abstract
Adversarial attacks pose a significant threat to data-driven systems, and researchers have devoted considerable resources to studying them. Yet this line of research has largely overlooked credit card fraud detection, despite its economic relevance. To address this gap, we propose a new threat model that exposes the limitations of existing attacks and highlights the need to investigate new approaches. We then design a new adversarial attack for credit card fraud detection that employs reinforcement learning to bypass classifiers. This attack, called FRAUD-RLA, is designed to maximize the attacker's reward by optimizing the exploration-exploitation tradeoff while requiring significantly less knowledge than competing attacks. Our experiments, conducted on three heterogeneous datasets and against two fraud detection systems, indicate that FRAUD-RLA is effective, even under the severe limitations imposed by our threat model.
