FinPos: A Position-Aware Trading Agent System for Real Financial Markets

Bijia Liu, Ronghao Dang

TL;DR

The paper introduces FinPos, a position-aware trading agent that uses a dual-decision architecture, a market signal processing module with domain-specific analysts, and a multi-timescale reward scheme to train long-horizon, risk-aware trading behavior. By explicitly modeling and managing positions, FinPos overcomes the myopia of single-step LLM trading agents and demonstrates superior profitability and drawdown control across multiple real stocks, including high-volatility names. Key contributions include the division of direction and quantity/risk decisions, a hierarchical memory framework to structure market information, prompt-based financial reasoning enhancements, and a multi-timescale reflection mechanism that aligns actions with short-, mid-, and long-term market signals. The results suggest substantial potential for LLM-centered systems in long-horizon market decision-making, while the authors acknowledge limitations in multi-asset coordination and broader asset class generalization.

Abstract

The exceptional potential of large language models (LLMs) in handling text information has garnered significant attention in the field of financial trading. However, current trading agents primarily focus on single-step trading tasks and lack awareness of continuous position management. Therefore, we propose a position-aware trading task designed to simulate a more realistic market. To address this task, we develop a trading agent system, FinPos, optimized for position management. FinPos is able to interpret various types of market information from a professional perspective, providing a reliable basis for positioning decisions. To mitigate the substantial market risks arising from position fluctuations, FinPos employs dual decision agents. Furthermore, the continuous nature of position management necessitates our adoption of multi-timescale rewards, which in turn empowers FinPos to effectively balance short-term fluctuations against long-term trends. Extensive experiments demonstrate that FinPos surpasses state-of-the-art trading agents in the position-aware trading task, which closely mirrors real market conditions. More importantly, our findings reveal that LLM-centered agent systems exhibit a vast, largely unexplored potential in long-term market decision-making.

Paper Structure

This paper contains 44 sections, 11 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: With the introduction of position awareness, the agent must not only predict current market trends but also manage the remaining positions in the account. Agents developed for tasks without position awareness are inadequate for addressing the new challenges posed by position-aware trading tasks.
  • Figure 2: Architectural Details of FinPos: Initially, multiple analysis agents leverage domain knowledge to gather diverse information from the environment, subsequently storing it in the memory module. The memory module utilizes a memory allocator to distribute the acquired information across memory layers of varying depths. Subsequently, the most pertinent information for the current decision is placed into working memory, where dual decision agents output trading actions. Finally, multi-timescale rewards guide reflection, storing experiential knowledge into deeper memory layers.
  • Figure 3: We conducted a fine-grained analysis of agent prompts along the dimensions of eight characteristics and three levels of emphasis.
  • Figure 4: The impact of varying the maximum timescale of the multi-timescale reward on various performance metrics.
  • Figure 5: Profitability versus risk dynamics on TSLA. Top-left: Calmar ratio, capturing return relative to maximum drawdown (higher is better). Bottom-left: joint view of profitability (CR%) and risk-control (1 - MDD), indicating return–risk balance. Right: time-series comparison of cumulative return (bottom) and exposure risk (top) across major events. Vertical dashed lines mark the occurrence of major events.
  • ...and 1 more figure
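
The Figure 5 caption refers to cumulative return (CR%), maximum drawdown (MDD), and the Calmar ratio as return relative to maximum drawdown. A minimal sketch of how these quantities could be computed from an account equity curve follows. The function names and the sample equity values are hypothetical, and the Calmar ratio is taken here as un-annualized cumulative return divided by MDD, which is an assumption; the paper's exact definition (for example, whether returns are annualized) may differ.

```python
from typing import Sequence


def cumulative_return(equity: Sequence[float]) -> float:
    """Fractional return from the first to the last equity value."""
    return equity[-1] / equity[0] - 1.0


def max_drawdown(equity: Sequence[float]) -> float:
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity[0]
    mdd = 0.0
    for value in equity:
        peak = max(peak, value)                 # highest equity seen so far
        mdd = max(mdd, (peak - value) / peak)   # worst decline from that peak
    return mdd


def calmar_ratio(equity: Sequence[float]) -> float:
    """Cumulative return divided by maximum drawdown (assumed definition)."""
    mdd = max_drawdown(equity)
    cr = cumulative_return(equity)
    if mdd == 0.0:
        # Equity never fell below a previous peak, so the ratio is unbounded
        # for a positive return and zero for a flat curve.
        return float("inf") if cr > 0 else 0.0
    return cr / mdd


# Hypothetical equity curve, not data from the paper.
equity_curve = [100.0, 120.0, 90.0, 110.0]
cr = cumulative_return(equity_curve)    # about 0.10
mdd = max_drawdown(equity_curve)        # 0.25, the fall from 120 to 90
calmar = calmar_ratio(equity_curve)     # about 0.40
```

Under these assumptions, the "1 - MDD" axis in the bottom-left panel would be `1 - max_drawdown(equity_curve)`, so a value closer to 1 indicates tighter risk control.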