Time-inconsistent mean-field stopping problems: A regularized equilibrium approach

Xiang Yu; Fengyi Yuan

TL;DR

This work tackles time-inconsistent mean-field stopping problems under a general non-exponential discount by introducing randomized (relaxed) equilibria and an entropy-regularized auxiliary problem. It develops a fixed-point framework with Gibbs-type policies, proving the existence of regularized equilibria for any $λ>0$ via Schauder's theorem and showing convergence to relaxed equilibria as $λ\to0$, without requiring decreasing impatience. The paper also links the MF-MDP, its McKean–Vlasov limit, and finite-$N$ agent problems, showing that regularized equilibria provide $ε$-equilibria for large $N$. Two illustrative examples (one with an explicit relaxed equilibrium and one on model-free ETF put exercise) demonstrate applicability and potential for RL-based computation in practice.
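To see why a non-exponential discount makes a stopping problem time-inconsistent, the following minimal sketch compares one fixed stop-or-wait decision as seen from two different dates. It is only an illustration: the hyperbolic discount $1/(1+kt)$, the rewards, and the function names are assumptions chosen for the example and are not taken from the paper.

```python
def hyperbolic(t: float, k: float = 1.0) -> float:
    """Hyperbolic discount 1 / (1 + k t), a standard non-exponential example (assumed here)."""
    return 1.0 / (1.0 + k * t)


def exponential(t: float, beta: float = 0.9) -> float:
    """Exponential discount beta ** t, the time-consistent benchmark."""
    return beta ** t


def prefers_waiting(discount, now: int, stop_time: int = 5,
                    stop_reward: float = 10.0, wait_reward: float = 12.0) -> bool:
    """Viewed from time `now`, is waiting until stop_time + 1 better than stopping at stop_time?

    The rewards and dates are hypothetical numbers chosen only to exhibit a reversal.
    """
    value_stop = discount(stop_time - now) * stop_reward
    value_wait = discount(stop_time + 1 - now) * wait_reward
    return value_wait > value_stop


if __name__ == "__main__":
    for name, discount in [("hyperbolic", hyperbolic), ("exponential", exponential)]:
        plan_at_0 = prefers_waiting(discount, now=0)
        plan_at_5 = prefers_waiting(discount, now=5)
        print(f"{name}: wait (seen from t=0) = {plan_at_0}, wait (seen from t=5) = {plan_at_5}")
```

Under the hyperbolic discount the time-0 self plans to wait, while the time-5 self prefers to stop immediately. Under the exponential discount both selves agree. This disagreement between selves is what motivates looking for equilibria rather than a single optimal stopping rule.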

Abstract

This paper studies a mean-field Markov decision process (MDP) with centralized stopping under non-exponential discounting. The problem differs fundamentally from most existing studies on mean-field optimal control and stopping because it is time-inconsistent by nature. We look for subgame perfect relaxed equilibria, namely randomized stopping policies that are consistent with the planning of future selves from the perspective of the social planner. Unlike many previous studies on time-inconsistent stopping, where decreasing impatience plays a key role, we consider a general discount function without imposing any conditions. As a result, studying relaxed equilibria becomes necessary, since a pure-strategy equilibrium may not exist in general. We formulate relaxed equilibria as fixed points of a complicated operator, whose existence is challenging to establish by a direct method. To overcome this obstacle, we first introduce an auxiliary problem with entropy regularization on the randomized policy and the discount function, and establish the existence of regularized equilibria as fixed points of an auxiliary operator via the Schauder fixed point theorem. Next, we show that the regularized equilibrium converges as the regularization parameter $λ$ tends to $0$, and that the limit is a fixed point of the original operator and hence a relaxed equilibrium. We also establish connections between the mean-field MDP and the $N$-agent MDP when $N$ is sufficiently large in our time-inconsistent setting.
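The effect of entropy regularization on a randomized stopping decision can be illustrated with a one-step toy problem. This is a sketch under stated assumptions, not the paper's auxiliary operator: the scalar values `stop_value` and `continue_value` and the function names are hypothetical. Maximizing $p\,a + (1-p)\,b + λ H(p)$ over $p\in[0,1]$, with $H(p) = -p\ln p - (1-p)\ln(1-p)$, gives the logistic (Gibbs-type) maximizer $p^* = 1/(1+e^{-(a-b)/λ})$.

```python
import math


def gibbs_stop_probability(stop_value: float, continue_value: float, lam: float) -> float:
    """Maximizer of p*stop_value + (1-p)*continue_value + lam*H(p) over p in [0, 1].

    One-step toy setting (an assumption for illustration); lam > 0 is the
    regularization parameter. Uses a numerically stable logistic function.
    """
    z = (stop_value - continue_value) / lam
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def regularized_objective(p: float, stop_value: float, continue_value: float, lam: float) -> float:
    """Entropy-regularized one-step objective, with the convention 0 * ln 0 = 0."""
    entropy = -sum(q * math.log(q) for q in (p, 1.0 - p) if q > 0.0)
    return p * stop_value + (1.0 - p) * continue_value + lam * entropy


if __name__ == "__main__":
    for lam in (1.0, 0.1, 0.01):
        p = gibbs_stop_probability(1.0, 0.5, lam)
        print(f"lambda={lam}: stop probability {p:.6f}")
```

As `lam` shrinks, the stopping probability in this toy problem concentrates on the action with the larger value, while an exact tie keeps probability 1/2 for every `lam`. The paper's convergence result concerns equilibria of the full mean-field problem and is considerably more delicate than this one-step picture.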

Paper Structure (21 sections, 26 theorems, 157 equations, 7 figures, 1 table, 2 algorithms)

Introduction
Literature Review
Our Contributions
Problem Formulation
Notations
Mean-field Markov Decision Process with Stopping
Relaxed Equilibria and Regularized Equilibria
Standing Assumptions
Main Results
Connections to the $N$-agent Problem
The Convergence of (N-MDP) to (Limit-MDP)
The Relationship between (Limit-MDP) and (MF-MDP)
Approximated Equilibrium of (N-MDP)
Examples
An Example with Explicit Solution: R&D Project Announcement
...and 6 more sections

Key Result

Lemma 2.1

Let $\tilde{\mathbb{P}}^{\mu}$ be the probability measure induced by the transition rule $\mu_{k+1}=T_0(\mu_k,Z^0)$ and the initial condition $\mu_0=\mu$, and let $\tilde{\mathbb{E}}^{\mu}$ denote its expectation. Then for any $\bm{\phi}\in \mathcal{F}$, $\mu\in \bar{S}$ and $k\in \mathbb{T}$, it holds that … From this point onwards, it is assumed by convention that $\prod_{k=0}^{-1}\equiv 1$.
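The empty-product convention $\prod_{k=0}^{-1}\equiv 1$ is the usual one and matches how an empty product is evaluated in code. The sketch below assumes, purely for illustration, that the product accumulates per-step continuation probabilities $1-\phi_j$ of a randomized stopping policy; the lemma's actual formula is not reproduced here, so `stop_probs` and `survival_probability` are hypothetical names.

```python
import math


def survival_probability(stop_probs: list[float], k: int) -> float:
    """Probability of not having stopped before step k.

    Hypothetical illustration: stop_probs[j] is the stopping probability at
    step j, so the result is the product of (1 - stop_probs[j]) for j < k.
    For k = 0 the product is empty and math.prod returns 1.
    """
    return math.prod(1.0 - p for p in stop_probs[:k])


if __name__ == "__main__":
    probs = [0.2, 0.5, 0.1]
    for k in range(len(probs) + 1):
        print(k, survival_probability(probs, k))
```

With this convention, formulas that multiply over all earlier steps need no separate case for the initial step.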

Figures (7)

Figure 1: Relationship among different MDP models.
Figure 2:
Figure 3:
Figure 4: Simulated data from \ref{exm:marketreturn} and \ref{exm:index} vs. real-world data.
Figure 5: The output policy of Algorithm \ref{exm:algo:policyiteration}. The right panel contains zoomed-in views of the regions indicated by black rectangles in the left panel.
...and 2 more figures

Theorems & Definitions (74)

Lemma 2.1
proof
Remark 1
Definition 2.2
Remark 2
Definition 2.3
Remark 3
Example 1
Proposition 3.1
proof
...and 64 more
