Achieving Pareto Optimality in Games via Single-bit Feedback
Seref Taha Kiremitci, Ahmed Said Donmez, Muhammed O. Sayin
TL;DR
Coordination in multi-agent systems under severe communication constraints is challenging. The authors introduce SBC-PE, a fully decentralized explore-then-commit mechanism that uses a single-bit signal per agent per round to maximize the social welfare $W(a)=\sum_{i=1}^n w_i\,u_i(a)$ in arbitrary finite games. Key contributions include (i) a simple, state-free protocol that requires only one-bit communication, (ii) finite-time guarantees with $\mathbb{E}[R_T]=O(\log T)$ and an explicit exploration length $K=\tfrac{\log(4MT\xi^2)}{2\xi^2}$, and (iii) rigorous regret analysis corroborated by simulations showing scalability and robustness. The work demonstrates that scalable welfare optimization is achievable under minimal communication, with convergence to the exact Pareto-optimal joint action in finite time.
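To make the two formulas above concrete, here is a minimal Python sketch that evaluates the weighted welfare $W(a)=\sum_i w_i\,u_i(a)$ by brute force on a tiny game and computes the exploration length $K$ exactly as written. It is not the SBC-PE protocol itself. The two-player payoff tables, the weights, and the values chosen for $M$, $T$, and $\xi$ are hypothetical, and $M$ and $\xi$ are treated simply as given constants because this summary does not define them.

```python
import itertools
import math


def social_welfare(action, utilities, weights):
    """Weighted welfare W(a) = sum_i w_i * u_i(a) for one joint action."""
    return sum(w * u[action] for w, u in zip(weights, utilities))


def welfare_maximizer(action_sets, utilities, weights):
    """Brute-force search over all joint actions (centralized baseline only)."""
    joint_actions = itertools.product(*action_sets)
    return max(joint_actions, key=lambda a: social_welfare(a, utilities, weights))


def exploration_length(M, T, xi):
    """K = log(4 * M * T * xi^2) / (2 * xi^2), as stated in the summary."""
    return math.log(4 * M * T * xi ** 2) / (2 * xi ** 2)


# Hypothetical 2-player, 2-action game (illustrative payoffs, not from the paper).
action_sets = [(0, 1), (0, 1)]
u1 = {(0, 0): 0.6, (0, 1): 0.1, (1, 0): 0.9, (1, 1): 0.3}
u2 = {(0, 0): 0.6, (0, 1): 0.8, (1, 0): 0.0, (1, 1): 0.3}
weights = [0.5, 0.5]

best = welfare_maximizer(action_sets, [u1, u2], weights)

# Assumed constants, chosen only to show how K scales with the horizon T.
K_short = exploration_length(M=4, T=10_000, xi=0.1)
K_long = exploration_length(M=4, T=1_000_000, xi=0.1)

print("welfare-maximizing joint action:", best)
print("K for T=1e4:", round(K_short, 1), "| K for T=1e6:", round(K_long, 1))
```

Multiplying $T$ by 100 increases $K$ only by $\log(100)/(2\xi^2)$, which illustrates why an exploration phase of this length is compatible with regret that is logarithmic in the horizon.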
Abstract
Efficient coordination in multi-agent systems often incurs high communication overhead or slow convergence, making scalable welfare optimization difficult. We propose Single-Bit Coordination Dynamics for Pareto-Efficient Outcomes (SBC-PE), a decentralized learning algorithm that requires only a single-bit satisfaction signal per agent in each round. Despite this minimal signaling, SBC-PE guarantees convergence to the exact optimal solution in arbitrary finite games. We establish explicit regret bounds showing that the expected regret grows only logarithmically with the horizon, i.e., $O(\log T)$. Compared with prior payoff-based or bandit-style rules, SBC-PE uniquely combines minimal signaling, general applicability, and finite-time guarantees. These results show that scalable welfare optimization is achievable under minimal communication constraints.
