Transparency as Delayed Observability in Multi-Agent Systems
Kshama Dwarakanath, Svitlana Vyetrenko, Toks Oyebode, Tucker Balch
TL;DR
This work formalizes transparency in multi-agent systems (MAS) as delayed observability of environment states, parameterized by a delay $\delta$, and studies its impact on agent strategies and social welfare through a learning-based framework. It models the setting as a partially observable stochastic game (POSG) solved with multi-agent reinforcement learning (MARL), with two agent archetypes, constrained and unconstrained, trained with PPO in a financial market simulated in ABIDES. Social welfare is defined as the product of equality and profitability, using either $SWF(Y)=\exp\left(-GE_{\kappa}(Y)\right)\times\bar{Y}$ or $SWF(Y)=\exp\left(-Theil_{L}(Y)\right)\times\bar{Y}$, where $Y$ denotes agent outcomes and $\bar{Y}$ their mean. The empirical results show opposing effects of delay on the two agent types: constrained agents benefit from higher delay (lower observability), whereas unconstrained agents benefit from lower delay, and overall social welfare peaks at an intermediate level of transparency ($\delta\approx300$). These findings suggest that partial transparency regimes can maximize welfare in complex MAS settings. They also have practical implications for policy design in markets and other dynamic systems, where information release must balance individual incentives with collective outcomes.
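The two welfare formulas above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the textbook definitions of the generalized entropy index and of Theil's L (mean log deviation), strictly positive outcomes, and a placeholder value $\kappa=2$ that the summary does not specify. The function names are hypothetical.

```python
import math

def generalized_entropy(y, kappa=2.0):
    """Textbook generalized entropy index GE_kappa (assumes kappa not in {0, 1})."""
    if kappa in (0.0, 1.0):
        raise ValueError("use the Theil limits for kappa = 0 or 1")
    n = len(y)
    mean = sum(y) / n
    return sum((v / mean) ** kappa - 1.0 for v in y) / (n * kappa * (kappa - 1.0))

def theil_l(y):
    """Theil's L (mean log deviation); assumes every outcome is strictly positive."""
    n = len(y)
    mean = sum(y) / n
    return sum(math.log(mean / v) for v in y) / n

def swf(y, inequality=theil_l):
    """Social welfare = exp(-inequality(Y)) * mean(Y), i.e. equality times profitability."""
    mean = sum(y) / len(y)
    return math.exp(-inequality(y)) * mean

# Hypothetical outcomes: equal outcomes incur no inequality penalty,
# so welfare equals the mean; unequal outcomes with the same mean score lower.
equal = [2.0, 2.0, 2.0]
unequal = [1.0, 3.0]
print(swf(equal), swf(unequal), swf(unequal, generalized_entropy))
```

Both inequality indices are zero when all outcomes are equal, so the exponential factor lies in $(0, 1]$ and acts as a discount on mean outcome.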
Abstract
Is transparency always beneficial in complex systems such as traffic networks and stock markets? How should transparency be defined in multi-agent systems, and what degree of transparency maximizes social welfare? We take an agent-based view and define transparency (or the lack of it) as delay in agent observability of environment states, and we use simulations to analyze the impact of delay on social welfare. To capture how agent strategies adapt to varying delays, we model agents as learners maximizing the same objectives under different delays in a simulated environment. Focusing on two agent types, constrained and unconstrained, we use multi-agent reinforcement learning to evaluate the impact of delay on agent outcomes and social welfare. An empirical demonstration of our framework in simulated financial markets shows opposing trends in the outcomes of constrained and unconstrained agents as delay varies, with an optimal partial-transparency regime at which social welfare is maximal.
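The notion of delay in observability can be illustrated with a small buffer that hands an agent the environment state from a fixed number of steps in the past. This is a sketch under stated assumptions rather than the paper's mechanism: the class name is hypothetical, delay is measured in discrete simulation steps, and before enough history exists the agent is assumed to see the earliest recorded state.

```python
from collections import deque

class DelayedObservation:
    """Returns the state recorded `delay` steps ago (hypothetical helper)."""

    def __init__(self, delay):
        if delay < 0:
            raise ValueError("delay must be non-negative")
        # Keep only the last delay + 1 states; the oldest one is what the agent sees.
        self._buffer = deque(maxlen=delay + 1)

    def observe(self, state):
        """Record the current state and return the delayed view of it."""
        self._buffer.append(state)
        # Assumption: until the buffer fills, fall back to the earliest state seen.
        return self._buffer[0]

# Full transparency corresponds to delay = 0; larger delays mean staler views.
view = DelayedObservation(delay=2)
seen = [view.observe(t) for t in range(5)]
print(seen)  # [0, 0, 0, 1, 2]
```

In a learning setup, a wrapper like this would sit between the environment and each agent's policy, so that the same objective is optimized under different values of the delay.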
