Financial Transaction Retrieval and Contextual Evidence for Knowledge-Grounded Reasoning

Artem Sakhno, Daniil Tomilov, Yuliana Shakhvalieva, Inessa Fedorova, Daria Ruzanova, Omar Zoloev, Andrey Savchenko, Maksim Makarenko

Abstract

The success of financial organizations depends heavily on their ability to process the digital traces their clients generate, such as transaction histories gathered from various sources, in order to improve user modeling pipelines. Because general-purpose LLMs struggle with time-distributed tabular data, production stacks still depend on specialized tabular and sequence models, which have limited transferability and require labeled data. To address this, we introduce FinTRACE, a retrieval-first architecture that converts raw transactions into reusable feature representations, applies rule-based detectors, and stores the resulting signals in a behavioral knowledge base with graded associations to the objectives of downstream tasks. Across public and industrial benchmarks, FinTRACE substantially improves low-supervision transaction analytics, doubling zero-shot MCC on churn prediction from 0.19 to 0.38 and improving 16-shot MCC from 0.25 to 0.40. We further use FinTRACE to ground LLMs via instruction tuning on retrieved behavioral patterns, achieving state-of-the-art LLM results on transaction analytics problems.
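
The results above are reported in MCC (Matthews correlation coefficient), which ranges from -1 to 1 and stays informative on imbalanced tasks such as churn prediction. As a reminder of what the metric measures, here is a minimal sketch for the binary case. The function name `mcc` and the convention of returning 0.0 when the denominator is zero are choices made for this illustration, not details taken from the paper.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption for this sketch: an undefined MCC (zero denominator,
    # e.g. only one class predicted) is reported as 0.0.
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Perfect predictions give 1.0, chance-level predictions give 0.0.
print(mcc(tp=50, tn=50, fp=0, fn=0))
print(mcc(tp=25, tn=25, fp=25, fn=25))
```

On this scale, moving from 0.19 to 0.38 means the predictions correlate twice as strongly with the true labels, while still being far from perfect agreement.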

Paper Structure (7 sections, 4 figures, 4 tables)

Figures (4)

  • Figure 1: Generalization gap between LLM-based and specialized pipelines. In zero- and few-shot settings (0/4/16 labels), LLM-based methods (GPT-OSS, Llama-3-Instruct, Reasoning Bank) plateau below 0.30 MCC, while specialized models (TabPFN, CatBoost, LLM4ES, CoLES) exceed 0.48 MCC only with full supervision. The shaded region marks this gap. This work (FinTRACE) narrows it, achieving 0.38 MCC in zero-shot and 0.48 with full labels.
  • Figure 2: FinTRACE overview. Raw transaction logs (a) are transformed into a structured knowledge base (b) with three semantic layers: feature essences, behavioral patterns, and downstream targets. Explicit white-box rules connect these layers (c), allowing the LLM to produce grounded predictions supported by traceable evidence chains (d).
  • Figure 3: Comparison of LLM approaches on a private dataset.
  • Figure 4: Impact of reflection across different shot budgets.
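
The layered design described for Figure 2 (features, behavioral patterns, downstream targets, connected by white-box rules) can be pictured with a toy sketch. Everything below is hypothetical and invented for illustration: the `Transaction` fields, the two rules, the association weights, and the function names are assumptions, not the paper's actual features, detectors, or numbers.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    day: int          # days since the start of the client's history
    amount: float
    category: str


def extract_features(txns, split_day):
    """Layer 1 (hypothetical): reusable features computed from raw transactions."""
    early = sum(t.amount for t in txns if t.day < split_day)
    recent = sum(t.amount for t in txns if t.day >= split_day)
    recent_categories = {t.category for t in txns if t.day >= split_day}
    return {
        "early_spend": early,
        "recent_spend": recent,
        "recent_categories": len(recent_categories),
    }


# Layer 2 (hypothetical): white-box rules that detect behavioral patterns.
RULES = {
    "spend_drop": lambda f: f["early_spend"] > 0
    and f["recent_spend"] < 0.5 * f["early_spend"],
    "narrow_activity": lambda f: f["recent_categories"] <= 1,
}

# Layer 3 (hypothetical): graded pattern-to-target associations.
# The weights are made-up placeholders, not values from the paper.
ASSOCIATIONS = {
    "spend_drop": {"churn": 0.8},
    "narrow_activity": {"churn": 0.4},
}


def retrieve_evidence(txns, target, split_day=30):
    """Return fired patterns linked to `target`, strongest association first."""
    features = extract_features(txns, split_day)
    evidence = []
    for name, rule in RULES.items():
        if rule(features):
            strength = ASSOCIATIONS.get(name, {}).get(target)
            if strength is not None:
                evidence.append((name, strength))
    return sorted(evidence, key=lambda item: item[1], reverse=True)


history = [
    Transaction(day=1, amount=100.0, category="grocery"),
    Transaction(day=10, amount=100.0, category="travel"),
    Transaction(day=40, amount=20.0, category="grocery"),
]
for pattern, strength in retrieve_evidence(history, target="churn"):
    print(f"{pattern}: association with churn = {strength}")
```

In a setup like this, the retrieved (pattern, strength) pairs would be the traceable evidence handed to the LLM, so each prediction can be traced back to a named rule and the features that triggered it.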