Centralized Reduction of Decentralized Stochastic Control Models and their weak-Feller Regularity

Omar Mrani-Zentar; Serdar Yüksel

TL;DR

The paper studies decentralized stochastic control with general state, measurement, and action spaces. It shows that problems under the one-step delayed and $K$-step periodic information sharing patterns can be reduced to centralized MDPs using state predictors. It establishes conditions under which these centralized reductions are weak-Feller, which yields the existence and stationarity of optimal policies and supports rigorous approximation and learning analyses. Using the Young topology on action mappings, together with finite-action and finite-state quantizations, the work provides practical routes to numerical solutions and near-optimal learning in multi-agent settings. The results also bear on completely decentralized information structures (CDIS): they suggest that large-period approximations can yield near-optimal solutions for CDIS, linking theory and computation in decentralized stochastic control.

Abstract

Decentralized stochastic control problems involving general state/measurement/action spaces are intrinsically difficult to study because of the inapplicability of standard tools from centralized (single-agent) stochastic control. In this paper, we address some of these challenges for decentralized stochastic control with standard Borel spaces under two different but closely related information structures: the one-step delayed information sharing pattern (OSDISP) and the $K$-step periodic information sharing pattern (KSPISP). We show that the one-step delayed and $K$-step periodic problems can each be reduced to a centralized Markov Decision Process (MDP), generalizing prior results which considered finite, linear, or static models, by addressing several measurability and topological questions. We then provide sufficient conditions for the transition kernels of both centralized reductions to be weak-Feller. The existence and separated nature of optimal policies under both information structures are then established. The weak-Feller regularity also facilitates rigorous approximation and learning-theoretic results, as shown in the paper.
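
For reference, the weak-Feller property invoked above is usually stated as follows. This is the standard textbook definition, not a formula quoted from the paper, and the symbols $\mathcal{T}$, $\mathbb{X}$, $\mathbb{U}$, and $f$ are generic placeholders assumed here for illustration. A controlled transition kernel $\mathcal{T}(\cdot \mid x,u)$ on a state space $\mathbb{X}$ with action space $\mathbb{U}$ is weak-Feller if, for every bounded continuous function $f:\mathbb{X}\to\mathbb{R}$,

$$
(x,u) \;\mapsto\; \int_{\mathbb{X}} f(x')\,\mathcal{T}(dx' \mid x,u) \quad \text{is continuous on } \mathbb{X}\times\mathbb{U}.
$$

In the centralized reductions, $x$ would play the role of the centralized state and $u$ that of the team's action, with the topologies on both spaces as specified in the paper.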

Paper Structure (24 sections, 15 theorems, 83 equations)

Problem Description
Various information structures
Literature Review and Contributions
Literature review
Main contributions
Notation and Preliminaries
Notation
Convergence of probability measures
Equivalent Formulations via Centralized MDP Reductions
One-step delayed information sharing pattern as a centralized MDP
$K$-step periodic information sharing pattern as a centralized MDP
Weak-Feller Property of the Equivalent MDPs, Existence of Optimal Policies, and their Rigorous Approximations
Weak-Feller property for the one-step delayed sharing pattern
Weak-Feller property for the $K$-step periodic information sharing pattern
Examples
...and 9 more sections

Key Result

Theorem 3.1

The problem (Prob 1), (obs 1), (cost) is equivalent to one where the team policy at time $t$ is given by $(\Tilde{\gamma}_{t}^{1},\ldots,\Tilde{\gamma}_{t}^{N})$ such that for all $i$, $\Tilde{\gamma}^{i}_{t}:I_{t}^{C} \mapsto f^{i}_{t}$ is $\sigma(I_{t}^{C})$-measurable and $f^{i}_{t}:y^{i}_{t} \mapsto \dots$

Theorems & Definitions (28)

Remark 1.1
Definition 3.1
Remark 3.1
Theorem 3.1
Theorem 3.2
Theorem 3.3
Remark 3.2
Theorem 3.4
Remark 3.3
Remark 3.5: $N$-step delayed sharing pattern with $N\geq 2$
...and 18 more
