Learning a Decentralized Medium Access Control Protocol for Shared Message Transmission
Lorenzo Mario Amorosa, Zhan Gao, Roberto Verdone, Petar Popovski, Deniz Gündüz
TL;DR
This paper tackles decentralized medium access control (MAC) for delivering multiple shared messages in large-scale Internet of Things (IoT) networks, where inter-node coordination is unavailable. It proves that optimal solutions are deterministic and that the problem is NP-hard, and it introduces a decentralized unsupervised learning framework (DUSL) with per-message deep neural networks (DNNs) and an online adaptation mechanism that handles time-varying activation patterns. DUSL combines centralized training with decentralized execution (CTDE), REINFORCE-style updates, and a deterministic inference path to produce scalable, collision-tolerant MAC decisions that outperform distributed multi-armed bandit (MAB) baselines, with the advantage growing as the network gets larger. The work also provides theoretical performance bounds under distribution shifts and demonstrates robust, rapid adaptation to dynamic environments, making it suitable for ultra-reliable low-latency communication (URLLC) and energy-constrained IoT deployments.
Abstract
In large-scale Internet of Things (IoT) networks, efficient medium access control (MAC) is critical due to the growing number of devices competing for limited communication resources. In this work, we consider a new challenge in which a set of nodes must transmit a set of shared messages to a central controller, without inter-node communication or retransmissions. Messages are distributed among random subsets of nodes, which must implicitly coordinate their transmissions over shared communication opportunities. The objective is to guarantee the delivery of all shared messages, regardless of which nodes transmit them. We first prove the optimality of deterministic strategies and characterize the success rate degradation of a deterministic strategy under dynamic message-transmission patterns. To solve this problem, we propose a decentralized learning-based framework that enables nodes to autonomously synthesize deterministic transmission strategies aiming to maximize message delivery success, together with an online adaptation mechanism that maintains stable performance in dynamic scenarios. Extensive simulations validate the framework's effectiveness, scalability, and adaptability, demonstrating its robustness to varying network sizes and fast adaptation to dynamic changes in transmission patterns, outperforming existing multi-armed bandit approaches.
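To make the delivery objective concrete, the following is a minimal toy sketch, not the paper's system model. It assumes a classical collision channel (a slot succeeds only if exactly one transmission occurs in it) and represents a deterministic strategy as a lookup table from (node, message) to a slot index, with missing entries meaning the node stays silent. The function name `all_delivered` and the table format are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def all_delivered(holdings, strategy, num_slots):
    """Check whether every message held by at least one node is delivered.

    holdings:  list of sets; holdings[n] is the set of message ids at node n.
    strategy:  dict mapping (node, message) -> slot index; a missing key
               means the node stays silent for that message (assumed format).
    num_slots: number of shared communication opportunities.

    Assumed channel model: a slot delivers its message only if exactly
    one transmission takes place in it; otherwise the slot is lost.
    """
    transmissions = defaultdict(list)  # slot -> messages sent in that slot
    for node, messages in enumerate(holdings):
        for msg in messages:
            slot = strategy.get((node, msg))
            if slot is None:
                continue  # node stays silent for this message
            if not 0 <= slot < num_slots:
                raise ValueError(f"slot {slot} out of range")
            transmissions[slot].append(msg)

    # A message is delivered if it is alone in at least one slot.
    delivered = {msgs[0] for msgs in transmissions.values() if len(msgs) == 1}
    required = set().union(*holdings)  # every message present in the network
    return required <= delivered


# Strategy A: both nodes send message 0 in slot 0; node 1 sends message 1 in slot 1.
strategy_a = {(0, 0): 0, (1, 0): 0, (1, 1): 1}
# Strategy B: node 1 defers to node 0 for message 0 and stays silent on it.
strategy_b = {(0, 0): 0, (1, 1): 1}

shared = [{0}, {0, 1}]     # message 0 is held by both nodes
only_one = [set(), {0, 1}]  # message 0 is held by node 1 only

print(all_delivered(shared, strategy_a, 2))    # collision on message 0
print(all_delivered(shared, strategy_b, 2))    # both messages get through
print(all_delivered(only_one, strategy_b, 2))  # message 0 is never sent
```

Under these assumptions, no fixed table works for every activation pattern: strategy A collides whenever message 0 is shared, while strategy B fails whenever node 0 does not hold message 0. This is the implicit coordination problem that the abstract describes, and it is why the success rate of a deterministic strategy depends on the distribution of message-to-node assignments.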
