Centralized Reduction of Decentralized Stochastic Control Models and their weak-Feller Regularity
Omar Mrani-Zentar, Serdar Yüksel
TL;DR
The paper tackles decentralized stochastic control with general state/measurement/action spaces by showing that one-step delayed and $K$-step periodic information-sharing problems can be reduced to centralized MDPs using state predictors. It establishes conditions under which these centralized reductions are weak-Feller, enabling the existence and stationarity of optimal policies and supporting rigorous approximation and learning analyses. By leveraging the Young topology for action mappings and proposing finite-action and finite-state quantizations, the work provides practical pathways for numerical solutions and near-optimal learning in multi-agent settings. The results also connect to completely decentralized information structures (CDIS) by suggesting how periodic-sharing approximations with a large period can yield near-optimal solutions for CDIS, thereby bridging theory and computation in decentralized stochastic control.
Abstract
Decentralized stochastic control problems involving general state/measurement/action spaces are intrinsically difficult to study because standard tools from centralized (single-agent) stochastic control do not apply. In this paper, we address some of these challenges for decentralized stochastic control with standard Borel spaces under two different but closely related information structures: the one-step delayed information sharing pattern (OSDISP) and the $K$-step periodic information sharing pattern (KSPISP). We show that the one-step delayed and $K$-step periodic problems can each be reduced to a centralized Markov Decision Process (MDP), generalizing prior results which considered finite, linear, or static models, by addressing several measurability and topological questions. We then provide sufficient conditions for the transition kernels of both centralized reductions to be weak-Feller. The existence and separated nature of optimal policies under both information structures are then established. The weak-Feller regularity also facilitates rigorous approximation and learning-theoretic results, as shown in the paper.
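For reference, the weak-Feller property invoked above can be stated as follows. This is the standard textbook definition written in illustrative notation, not notation taken from the paper: $\mathcal{Z}$ and $\mathcal{A}$ are assumed names for the state and action spaces of a reduced MDP, $\eta$ for its transition kernel, and $C_b(\mathcal{Z})$ for the continuous bounded real-valued functions on $\mathcal{Z}$.

$$
\text{for every } f \in C_b(\mathcal{Z}): \quad (z,a) \;\mapsto\; \int_{\mathcal{Z}} f(z')\,\eta(dz' \mid z,a) \quad \text{is continuous on } \mathcal{Z} \times \mathcal{A}.
$$

Equivalently, $(z_n,a_n) \to (z,a)$ implies that $\eta(\,\cdot \mid z_n,a_n)$ converges weakly to $\eta(\,\cdot \mid z,a)$.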
