Limited or Biased: Modeling Sub-Rational Human Investors in Financial Markets

Penghang Liu; Kshama Dwarakanath; Svitlana S Vyetrenko; Tucker Balch

TL;DR

The paper tackles the challenge of modeling human sub-rationality in financial markets by proposing a flexible reinforcement learning framework that trains sub-rational investors in a high-fidelity multi-agent market simulator. It defines five sub-rational types (bounded rational, myopic, prospect biased, optimistic, pessimistic) and develops internal-belief models to inject biases, enabling controlled experiments on trading strategies and market impact. Using SHAP for interpretability, the study reveals how each bias shifts reliance on market observables and affects liquidity, volatility, and price efficiency. The findings show distinct trade-offs: bounded rational and prospect-biased behaviors tend to improve liquidity but degrade efficiency, while myopic behavior enhances efficiency but reduces liquidity, with optimistic and pessimistic behaviors producing more nuanced and sometimes destabilizing effects. Overall, the work provides a unified framework for simulating and evaluating sub-rational human investors, offering a path toward understanding their practical implications and toward insights relevant to regulators in complex market environments.

Abstract

Human decision-making in real life deviates significantly from the optimal decisions made by fully rational agents, primarily due to computational limitations or psychological biases. While existing studies in behavioral finance have uncovered various aspects of human sub-rationality, a comprehensive framework for transferring these findings into an adaptive human model applicable across diverse financial market scenarios is still lacking. In this study, we introduce a flexible model that incorporates five different aspects of human sub-rationality using reinforcement learning. Our model is trained using a high-fidelity multi-agent market simulator, which overcomes limitations associated with the scarcity of labeled data on individual investors. We evaluate the behavior of sub-rational human investors using hand-crafted market scenarios and SHAP value analysis, showing that our model accurately reproduces the observations of previous studies and reveals insights into the driving factors of human behavior. Finally, we explore the impact of sub-rationality on the investor's Profit and Loss (PnL) and market quality. Our experiments reveal that bounded-rational and prospect-biased human behaviors improve liquidity but diminish price efficiency, whereas human behavior influenced by myopia, optimism, and pessimism reduces market liquidity.

Paper Structure (31 sections, 13 equations, 15 figures, 5 tables)

This paper contains 31 sections, 13 equations, 15 figures, 5 tables.

Introduction
Literature Review
Human Sub-rationality
Human Decision Models
Multi-agent Market Simulations
Background
Limit Order Book (LOB) structure
Metrics and Notations
Measures of Market Quality
Liquidity
Volatility
Market efficiency
Reinforcement Learning for Investor Modeling
Defining RL Investors
Training in Multi-Agent Market Simulations
...and 16 more sections

Figures (15)

Figure 1: A snapshot of the LOB structure.
Figure 2: Training RL agents using market simulations or biased internal beliefs. (left) The RL agent learns a trading strategy by directly interacting with the simulated markets. (right) The RL agent learns from an internal model. We first learn a probabilistic internal model from samples of the environment. We can then inject bias into the internal model and use it to train a biased human investor.
Figure 3: An example of the Boltzmann rationality model from laidlaw2022boltzmann for three actions with different rewards (left). For demonstration purposes, we consider a one-step decision problem in a deterministic environment. The Boltzmann model gives the probability of taking each action, with the $\beta$ parameter adjusting the degree of rationality (right). If $\beta = 0$, each action has the same probability of being selected. When $\beta = 10$, the model becomes more rational and only the action with the highest reward is likely to be selected.
Figure 4: The behavior of bounded rational investors in the simulated market. Compared to the fully rational investor, a bounded rational human investor takes sub-optimal actions that are similar but inferior.
Figure 5: Comparison of exponential discounting to hyperbolic discounting of \$100 over $390$ time steps. See the steady drop in discount rate for the orange exponential curve. On the other hand, the blue hyperbolic curve shows steep discounting early on and modest discounting later on.
...and 10 more figures
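
The Boltzmann rationality model described in the Figure 3 caption can be sketched in a few lines. This is a minimal illustration for a one-step decision with three actions, assuming the common form $P(a) \propto \exp(\beta \, r(a))$; the function name `boltzmann_policy` and the reward values are hypothetical and are not taken from the paper.

```python
import math

def boltzmann_policy(rewards, beta):
    """Return action probabilities P(a) proportional to exp(beta * r(a)).

    beta = 0 gives a uniform choice; a larger beta concentrates
    probability on the highest-reward action.
    """
    # Subtract the maximum reward before exponentiating for numerical stability.
    r_max = max(rewards)
    weights = [math.exp(beta * (r - r_max)) for r in rewards]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical rewards for three actions (illustrative values only).
rewards = [1.0, 2.0, 3.0]

uniform = boltzmann_policy(rewards, beta=0.0)   # equal probabilities
sharp = boltzmann_policy(rewards, beta=10.0)    # mass on the best action
```

With these assumed rewards, `uniform` assigns one third to each action, while `sharp` places almost all probability on the third action, matching the qualitative behavior the caption describes for $\beta = 0$ and $\beta = 10$.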
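
The contrast drawn in the Figure 5 caption between exponential and hyperbolic discounting can also be made concrete. The sketch below assumes the standard forms $\gamma^t$ and $1/(1 + kt)$; the parameter values `GAMMA = 0.99` and `K = 0.1` are illustrative assumptions, since the caption does not state the values used in the paper.

```python
def exponential_discount(t, gamma):
    """Exponential discount factor gamma**t (constant per-step ratio)."""
    return gamma ** t

def hyperbolic_discount(t, k):
    """Hyperbolic discount factor 1 / (1 + k*t) (steep early, flat later)."""
    return 1.0 / (1.0 + k * t)

# Assumed parameters, chosen only to illustrate the shapes of the two curves.
GAMMA = 0.99
K = 0.1
AMOUNT = 100.0   # the $100 payoff from the caption
HORIZON = 390    # number of time steps from the caption

exp_values = [AMOUNT * exponential_discount(t, GAMMA) for t in range(HORIZON + 1)]
hyp_values = [AMOUNT * hyperbolic_discount(t, K) for t in range(HORIZON + 1)]

# Per-step ratio between consecutive discounted values.
exp_ratio_early = exp_values[1] / exp_values[0]
exp_ratio_late = exp_values[HORIZON] / exp_values[HORIZON - 1]
hyp_ratio_early = hyp_values[1] / hyp_values[0]
hyp_ratio_late = hyp_values[HORIZON] / hyp_values[HORIZON - 1]
```

Under these assumptions the exponential curve loses the same fraction of value at every step, whereas the hyperbolic curve loses value quickly at first and very slowly near the end of the horizon, which is the pattern usually associated with myopic, present-biased preferences.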
