TradeTrap: Are LLM-based Trading Agents Truly Reliable and Faithful?

Lewen Yan; Jilin Mei; Tianyi Zhou; Lige Huang; Jie Zhang; Dongrui Liu; Jing Shao

TL;DR

The paper introduces TradeTrap, a framework for system-level stress-testing of LLM-based trading agents that models four core components (market intelligence, strategy formulation, portfolio and ledger handling, and trade execution) and their attack surfaces. It evaluates robustness through closed-loop backtesting on real US equities, applying targeted perturbations such as data fabrication, prompt injection, memory poisoning, and state tampering. The key finding is that small perturbations can cascade into extreme risk: adaptive agents are particularly vulnerable to information-channel attacks, while procedural agents are somewhat more robust but remain susceptible to state- and memory-based disruptions. The work highlights the need for explicit cross-module security and state-verification mechanisms to build safer, more reliable autonomous trading systems in high-stakes environments.

Abstract

LLM-based trading agents are increasingly deployed in real-world financial markets to perform autonomous analysis and execution. However, their reliability and robustness under adversarial or faulty conditions remain largely unexamined, despite operating in high-risk, irreversible financial environments. We propose TradeTrap, a unified evaluation framework for systematically stress-testing both adaptive and procedural autonomous trading agents. TradeTrap targets four core components of autonomous trading agents: market intelligence, strategy formulation, portfolio and ledger handling, and trade execution, and evaluates their robustness under controlled system-level perturbations. All evaluations are conducted in a closed-loop historical backtesting setting on real US equity market data with identical initial conditions, enabling fair and reproducible comparisons across agents and attacks. Extensive experiments show that small perturbations at a single component can propagate through the agent decision loop and induce extreme concentration, runaway exposure, and large portfolio drawdowns across both agent types, demonstrating that current autonomous trading agents can be systematically misled at the system level. Our code is available at https://github.com/Yanlewen/TradeTrap.
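
To make the idea of a closed-loop backtest with a single-component perturbation concrete, here is a minimal sketch. It is not the TradeTrap implementation: the rule-based `toy_agent`, the `inflate` data-fabrication hook, the `Ledger` type, and the price series are all hypothetical stand-ins chosen for illustration, and no LLM is involved. The sketch only shows how tampering with the observed market feed, while execution still happens at the true price, can steer an agent into steadily growing exposure in a falling market.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Ledger:
    """Hypothetical portfolio state: cash plus a single-asset position."""
    cash: float
    shares: int = 0


def toy_agent(seen: List[float], ledger: Ledger) -> int:
    """Illustrative rule (an assumption, not an LLM agent):
    buy 10 shares after an observed up-move, sell 10 after a down-move."""
    if len(seen) < 2:
        return 0
    if seen[-1] > seen[-2]:
        return 10
    if seen[-1] < seen[-2] and ledger.shares >= 10:
        return -10
    return 0


def inflate(t: int, price: float) -> float:
    """Hypothetical data-fabrication attack: the observed price is inflated
    by an extra 2% per step, so a falling market looks like a rising one."""
    return price * (1 + 0.02 * t)


def backtest(
    prices: List[float],
    perturb: Optional[Callable[[int, float], float]] = None,
    cash: float = 10_000.0,
) -> float:
    """Run the decision loop and return final equity at the true last price."""
    ledger = Ledger(cash=cash)
    seen: List[float] = []
    for t, true_price in enumerate(prices):
        # Market-intelligence channel: the only place the attack is applied.
        observed = perturb(t, true_price) if perturb else true_price
        seen.append(observed)
        order = toy_agent(seen, ledger)
        # Execution settles at the true price; skip buys the ledger cannot fund.
        if order > 0 and order * true_price > ledger.cash:
            order = 0
        ledger.cash -= order * true_price
        ledger.shares += order
    return ledger.cash + ledger.shares * prices[-1]


prices = [100.0, 99.0, 98.0, 97.0, 96.0]  # made-up falling market
clean = backtest(prices)
attacked = backtest(prices, perturb=inflate)
print(clean, attacked)
```

In this toy setting the unperturbed agent never trades, while the perturbed agent buys at every step and ends with lower equity. Both runs share the same prices and starting cash, mirroring the identical-initial-conditions setup described in the abstract.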
