RIFF: Inducing Rules for Fraud Detection from Decision Trees

João Lucas Martins, João Bravo, Ana Sofia Gomes, Carlos Soares, Pedro Bizarro

TL;DR

This work tackles fraud detection where interpretability and a low false-positive rate are essential. It introduces RIFF, a rule-induction framework that first derives candidate rules from leaves of tree-based models and then greedily selects a small, high-precision subset under the constraint $\text{FPR} \le \text{FPR}_{\max}$; it also proposes FIGU, a binarized-residual variant for additive trees like FIGS to reduce overlap. The authors evaluate on public datasets (BAF, Taiwan Credit) and a private dataset, comparing against CART, FIGS, and expert rules as well as a LightGBM baseline; RIFF consistently improves recall at low $\text{FPR}$ and reduces the resulting rule-set length, with FIGU+RIFF yielding even shorter, competitive rules. They discuss limitations and future directions, including pruning internal nodes and combining rules from multiple trees (e.g., Random Forest or Bagging FIGS) to broaden candidate diversity.
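The selection step described above can be illustrated with a minimal sketch. It assumes each candidate rule is given as a boolean coverage mask over a labeled sample, and it uses a hypothetical scoring choice (precision on newly covered rows, ties broken by newly caught fraud); the paper's actual selection criterion may differ. The function name `greedy_select` and its parameters are illustrative, not taken from the paper.

```python
import numpy as np

def greedy_select(rule_masks, y, fpr_max):
    """Greedily add rules while the FPR of their union stays <= fpr_max.

    rule_masks: list of boolean arrays, True where a rule fires (assumed input format).
    y: binary labels, 1 = fraud.
    Returns the indices of the selected rules and the union coverage mask.
    """
    y = np.asarray(y, dtype=bool)
    n_neg = int((~y).sum())
    covered = np.zeros(y.shape[0], dtype=bool)
    selected = []
    remaining = set(range(len(rule_masks)))

    while remaining:
        best, best_gain, best_prec = None, 0, -1.0
        for i in sorted(remaining):
            mask = np.asarray(rule_masks[i], dtype=bool)
            union = covered | mask
            false_pos = int((union & ~y).sum())
            # Skip rules that would push the rule set past the FPR budget.
            if n_neg and false_pos / n_neg > fpr_max:
                continue
            new = mask & ~covered
            gain = int((new & y).sum())  # fraud cases not yet caught
            if gain == 0:
                continue
            prec = gain / int(new.sum())
            # Assumed heuristic: prefer precise rules, then larger gain.
            if prec > best_prec or (prec == best_prec and gain > best_gain):
                best, best_gain, best_prec = i, gain, prec
        if best is None:
            break  # no remaining rule adds recall within the budget
        covered |= np.asarray(rule_masks[best], dtype=bool)
        selected.append(best)
        remaining.remove(best)

    return selected, covered
```

The loop stops as soon as no remaining rule adds recall within the budget, so the selected set tends to stay short. Precision is computed on newly covered rows only, so a rule is not rewarded for fraud that an earlier rule already caught.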

Abstract

Financial fraud is the cause of multi-billion dollar losses annually. Traditionally, fraud detection systems rely on rules due to their transparency and interpretability, key features in domains where decisions need to be explained. However, rule systems require significant input from domain experts to create and tune, an issue that rule induction algorithms attempt to mitigate by inferring rules directly from data. We explore the application of these algorithms to fraud detection, where rule systems are constrained to have a low false positive rate (FPR) or alert rate, by proposing RIFF, a rule induction algorithm that distills a low FPR rule set directly from decision trees. Our experiments show that the induced rules are often able to maintain or improve performance of the original models for low FPR tasks, while substantially reducing their complexity and outperforming rules hand-tuned by experts.
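To make the idea of distilling rules from a decision tree concrete, the sketch below turns each leaf of a small tree into a conjunction of threshold conditions. The `Node` class, the feature names, and the `fraud_rate` field are hypothetical stand-ins for whatever tree representation is used in practice; this is an illustration of leaf-to-rule conversion in general, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    # Hypothetical minimal tree node: internal nodes test feature <= threshold.
    feature: Optional[str] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None       # branch taken when feature <= threshold
    right: Optional["Node"] = None      # branch taken when feature > threshold
    fraud_rate: Optional[float] = None  # set on leaves only

def leaf_rules(node, path=()):
    """Return one (conditions, fraud_rate) pair per leaf, in left-to-right order."""
    if node.left is None and node.right is None:
        return [(list(path), node.fraud_rate)]
    rules = []
    rules += leaf_rules(node.left, path + ((node.feature, "<=", node.threshold),))
    rules += leaf_rules(node.right, path + ((node.feature, ">", node.threshold),))
    return rules

def matches(conditions, row):
    """Check whether a row (a dict of feature values) satisfies every condition."""
    return all(
        row[feat] <= thr if op == "<=" else row[feat] > thr
        for feat, op, thr in conditions
    )
```

Each root-to-leaf path becomes one candidate rule, so a tree with $k$ leaves yields $k$ candidates that a later selection step can filter.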

Paper Structure (7 sections, 7 equations, 2 figures, 3 tables, 1 algorithm)

Figures (2)

  • Figure 1: RIFF Overview
  • Figure 2: Extracting rules from a FIGS model