Risk Sensitivity in Markov Games and Multi-Agent Reinforcement Learning: A Systematic Review

Hafez Ghaemi, Shirin Jamshidi, Mohammad Mashreghi, Majid Nili Ahmadabadi, Hamed Kebriaei

TL;DR

This paper addresses how risk attitudes shape decision making in MGs and MARL, arguing that risk-neutral formulations fail in domains like finance and autonomous systems. It offers a comprehensive taxonomy of risk measures, spanning explicit (exponential, coherent, CPT) and implicit (variance, CVaR, chance constraint) classes, and synthesizes 59 studies to reveal methodological and application-driven trends. The review highlights theoretical guarantees (existence of NE/ME, HJI/Poisson equations) and practical algorithms (distributional RL, AIM/ADMM, CPT-based policies) that advance risk-sensitive multi-agent learning. By mapping how risk measures influence equilibria, learning dynamics, and policy design, the work informs both theory and real-world deployment of cooperative and competitive MAS under uncertainty.

Abstract

Markov games (MGs) and multi-agent reinforcement learning (MARL) are studied to model decision making in multi-agent systems. Traditionally, the objective in MG and MARL has been risk-neutral, i.e., agents are assumed to optimize a performance metric such as expected return, without taking into account their own subjective or cognitive preferences or those of other agents. However, ignoring such preferences leads to inaccurate models of decision making in many real-world scenarios in finance, operations research, and behavioral economics. Therefore, when these preferences are present, it is necessary to incorporate a suitable measure of risk into the optimization objective of agents, which opens the door to risk-sensitive MG and MARL. In this paper, we systematically review the literature on risk sensitivity in MG and MARL, which has been growing in recent years alongside other areas of reinforcement learning and game theory. We define and mathematically describe the different risk measures used in MG and MARL and, for each measure individually, discuss the articles that incorporate it. Finally, we identify recent trends in theoretical and applied works in the field and discuss possible directions of future research.

Paper Structure (25 sections, 22 equations, 2 figures)

Figures (2)

  • Figure 1: (Left) Conventional CPT weighting functions; $\omega^+(p) = \frac{p^{\gamma}}{(p^{\gamma} + (1-p)^{\gamma})^{1/\gamma}}$ and $\omega^-(p) = \frac{p^{\delta}}{(p^{\delta} + (1-p)^{\delta})^{1/\delta}}$ with $\gamma=\delta=0.69$. (Right) Conventional CPT utility functions; $u^+(x)=x^{\alpha}$ for $x\geq 0$, and $-u^-(x)=-\lambda(-x)^{\beta}$ for $x<0$, with $\alpha=\beta=0.65$ and $\lambda=2.6$.
  • Figure 2: Number of articles on risk-sensitive MG and MARL over the years
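
The CPT weighting and utility functions in the Figure 1 caption can be evaluated directly. The following is a minimal Python sketch of those two formulas with the caption's parameter values as defaults. The function names `cpt_weight` and `cpt_utility` are illustrative assumptions, not names from the paper.

```python
def cpt_weight(p: float, exponent: float = 0.69) -> float:
    """Probability weighting w(p) = p^e / (p^e + (1 - p)^e)^(1/e).

    Use exponent = gamma for gains (omega^+) and exponent = delta for
    losses (omega^-); the Figure 1 caption sets both to 0.69.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    numerator = p ** exponent
    denominator = (p ** exponent + (1.0 - p) ** exponent) ** (1.0 / exponent)
    return numerator / denominator


def cpt_utility(x: float, alpha: float = 0.65, beta: float = 0.65,
                lam: float = 2.6) -> float:
    """Piecewise utility: u^+(x) = x^alpha for x >= 0,
    and -u^-(x) = -lam * (-x)^beta for x < 0."""
    if x >= 0:
        return x ** alpha
    return -lam * (-x) ** beta


if __name__ == "__main__":
    # Print a few points of each curve (values are computed, not quoted).
    for p in (0.01, 0.5, 0.99):
        print(f"w({p}) = {cpt_weight(p):.4f}")
    for x in (-1.0, 0.0, 1.0):
        print(f"u({x}) = {cpt_utility(x):.4f}")
```

With these parameters, small probabilities are overweighted (`cpt_weight(0.01) > 0.01`), moderate and large ones are underweighted, and a unit loss weighs more heavily than a unit gain because $\lambda > 1$.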

Theorems & Definitions (1)

  • Definition 1: Markov game