Simulating the Economic Impact of Rationality through Reinforcement Learning and Agent-Based Modelling

Simone Brusatin, Tommaso Padoan, Andrea Coletta, Domenico Delli Gatti, Aldo Glielmo

TL;DR

This work proposes a ‘Rational macro ABM’ (R-MABM) framework by extending a paradigmatic macro ABM from the economic literature and finds that a higher number of rational (RL) agents in the economy always improves the macroeconomic environment as measured by total output.

Abstract

Agent-based models (ABMs) are simulation models used in economics to overcome some of the limitations of traditional frameworks based on general equilibrium assumptions. However, agents within an ABM follow predetermined 'bounded rational' behavioural rules which can be cumbersome to design and difficult to justify. Here we leverage multi-agent reinforcement learning (RL) to expand the capabilities of ABMs with the introduction of 'fully rational' agents that learn their policy by interacting with the environment and maximising a reward function. Specifically, we propose a 'Rational macro ABM' (R-MABM) framework by extending a paradigmatic macro ABM from the economic literature. We show that gradually substituting ABM firms in the model with RL agents, trained to maximise profits, allows for studying the impact of rationality on the economy. We find that RL agents spontaneously learn three distinct strategies for maximising profits, with the optimal strategy depending on the level of market competition and rationality. We also find that RL agents with independent policies, and without the ability to communicate with each other, spontaneously learn to segregate into different strategic groups, thus increasing market power and overall profits. Finally, we find that a higher number of rational (RL) agents in the economy always improves the macroeconomic environment as measured by total output. Depending on the specific rational policy, this can come at the cost of higher instability. Our R-MABM framework allows for stable multi-agent learning, is available in open source, and represents a principled and robust direction to extend economic simulators.

Paper Structure

This paper contains 9 sections, 7 equations, 4 figures, and 1 table.

Figures (4)

  • Figure 1: The R-MABM model. The left panel shows a schematic diagram of the basis model, in which 4 types of agents (green ovals) exchange goods (yellow rectangles). The arrows represent the flow of each good, from provider to receiver. The middle panel illustrates the extension of the basis model that we implement with our R-MABM framework. In the standard model, consumption-good producing firms ('C-firms') are exclusively 'bounded rational' and take decisions using a heuristic trend-following rule. These are augmented with 'fully rational' RL agents that take decisions in order to maximise profits. The right panel shows a typical learning curve, in which fully rational RL agents learn to accumulate higher profits than bounded rational agents as the number of learning episodes grows.
  • Figure 2: Different emerging strategies for different economic environments. The middle panel (B) shows the time series of observed sales ($Y^s$) and price-deltas ($\tilde{\Delta}^P$) for a sample bounded rational agent (dashed line) and RL agent (solid line) under 3 different combinations of market competition ($z_c$) and degree of rationality ($N$) as indicated by the arrows. The 3 combinations give rise to 3 different strategies for the RL agents. From top to bottom, we find 'perfect competition', 'dumping' and 'market power' strategies (see main text for more details). The change in the emerging strategy for different economic environments is highlighted in the left (A) and right (C) panels, which plot the average value of price-delta (left axes) and sales (right axes) as a function of $z_c$ and $N$ respectively.
  • Figure 3: Independent RL agents spontaneously segregate into strategic groups, increasing overall profits. The left panel (A) shows the total mean cumulative rewards (i.e., profits) for a varying number of agents, with $z_c=5$, and for RL agents with shared or independent policies. Agents with independent policies always achieve higher overall rewards as a result of spontaneous segregation into strategic groups. Segregation can be clearly observed in the middle and right panels (B) and (C), which show the agent-specific price delta and sales as a function of a growing number of rational agents, with $z_c=5$. A small amount of noise was added to the $x$-coordinates to better separate nearby points. The different strategic groups are particularly easy to spot when plotting price delta against sales, as done in the inset of (A) for $N=20$.
  • Figure 4: RL agents always increase total macroeconomic output, but only perfect competition also increases economic stability. Mean (A) and standard deviation (B) of the GDP in the last 1000 steps of simulations with a growing number of rational agents, trained with either shared or independent policies. Error bars represent the standard error on average quantities. The last panel (C) shows the response function of consumption, GDP and GDP deflator to a positive shock in the propensity to consume of all households. All results in the figure are obtained with $z_c=5$, and the impulse responses are obtained with $N=7$.
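
The summary statistics described in the Figure 4 caption (mean and standard deviation of GDP over the last 1000 steps, with standard errors) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `tail_stats` and `summarise_runs` are hypothetical, the GDP series are synthetic placeholders, and averaging over independent simulation runs to obtain the standard error is an assumption, since the caption does not say what the averages are taken over.

```python
import random
import statistics
from math import sqrt


def tail_stats(gdp_series, window=1000):
    """Mean and (population) standard deviation of the last `window` steps of one run."""
    tail = gdp_series[-window:]
    return statistics.fmean(tail), statistics.pstdev(tail)


def summarise_runs(runs, window=1000):
    """Average the per-run tail statistics and attach a standard error to each.

    Assumption: the standard error is taken across independent runs.
    """
    means, stds = zip(*(tail_stats(run, window) for run in runs))
    n = len(runs)

    def avg_and_sem(values):
        # Standard error of the mean: sample standard deviation / sqrt(n).
        sem = statistics.stdev(values) / sqrt(n) if n > 1 else float("nan")
        return statistics.fmean(values), sem

    return {"gdp_mean": avg_and_sem(means), "gdp_std": avg_and_sem(stds)}


# Synthetic placeholder data (NOT model output): 5 runs of 3000 steps each.
rng = random.Random(0)
runs = [[100.0 + rng.gauss(0.0, 2.0) for _ in range(3000)] for _ in range(5)]

summary = summarise_runs(runs)
print("GDP mean (value, standard error):", summary["gdp_mean"])
print("GDP std  (value, standard error):", summary["gdp_std"])
```

Repeating `summarise_runs` for each number of rational agents $N$, and for shared and independent policies separately, would give one point with an error bar per configuration, matching the layout of panels (A) and (B).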