A Distributed Neural Linear Thompson Sampling Framework to Achieve URLLC in Industrial IoT

Francesco Pase, Marco Giordani, Sara Cavallero, Malte Schellmann, Josef Eichinger, Roberto Verdone, Michele Zorzi

TL;DR

DIStributed combinatorial NEural linear Thompson Sampling (DISNETS) is a scheduling framework that combines the strengths of centralized grant-based scheduling and distributed scheduling. Using a feedback signal from the gNB and reinforcement learning, the UEs learn to autonomously optimize their uplink transmissions by selecting, among the available resources, those that minimize the number of collisions.

Abstract

Industrial Internet of Things (IIoT) networks will provide Ultra-Reliable Low-Latency Communication (URLLC) to support critical processes underlying the production chains. However, standard protocols for allocating wireless resources may not optimize the latency-reliability trade-off, especially for uplink communication. For example, centralized grant-based scheduling can ensure almost zero collisions, but introduces delays because resources must be requested by the User Equipments (UEs) and granted by the gNB. In turn, distributed scheduling (e.g., based on random access), in which UEs autonomously choose the resources for transmission, may lead to many collisions, especially as the traffic increases. In this work, we propose DIStributed combinatorial NEural linear Thompson Sampling (DISNETS), a novel scheduling framework that combines the best of the two worlds. By leveraging a feedback signal from the gNB and reinforcement learning, the UEs are trained to autonomously optimize their uplink transmissions by selecting the available resources to minimize the number of collisions, without additional message exchange to/from the gNB. DISNETS is a distributed, multi-agent adaptation of the Neural Linear Thompson Sampling (NLTS) algorithm, which has been further extended to admit multiple parallel actions. We demonstrate the superior performance of DISNETS in addressing URLLC in IIoT scenarios compared to other baselines.
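The claim that random access leads to many collisions as traffic grows can be illustrated with a small sketch. It assumes a deliberately simplified model that is not the paper's traffic model: every UE is active in every slot and picks exactly one orthogonal channel uniformly at random. The function names and the numbers used are hypothetical and chosen only for illustration.

```python
import random


def collision_fraction(num_ues, num_channels, trials=2000, seed=0):
    """Monte Carlo estimate of the fraction of transmissions that collide.

    Assumption (not from the paper): each UE transmits in every slot on one
    channel chosen uniformly at random, independently of the other UEs.
    """
    rng = random.Random(seed)
    collided = 0
    for _ in range(trials):
        picks = [rng.randrange(num_channels) for _ in range(num_ues)]
        counts = {}
        for ch in picks:
            counts[ch] = counts.get(ch, 0) + 1
        # A transmission collides if at least one other UE chose its channel.
        collided += sum(1 for ch in picks if counts[ch] > 1)
    return collided / (trials * num_ues)


def analytic_collision_probability(num_ues, num_channels):
    """Closed form under the same assumption: 1 - (1 - 1/M)^(N - 1)."""
    return 1.0 - (1.0 - 1.0 / num_channels) ** (num_ues - 1)


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        print(n, round(collision_fraction(n, 7), 3),
              round(analytic_collision_probability(n, 7), 3))
```

Under this toy model, the collision probability rises quickly with the number of UEs for a fixed number of channels, which is the regime where a learned, feedback-driven channel selection is meant to help.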

Paper Structure (38 sections, 8 equations, 9 figures, 3 tables, 1 algorithm)

Figures (9)

  • Figure 1: Factory floor layout (with $W=2$, $M=7$, and $N=18$) and traffic correlation. Specifically, machines in each production line are correlated, and activate according to a specific sequence on the production line, i.e., toward the right or the left. At $t_1$, $W=2$ machines (i.e., one per production line) activate, and the corresponding UEs onboard the active machines start sending data as periodic, aperiodic, or UE-specific aperiodic traffic. At $t_2=t_1+\tau_a$, these machines shut down and the next activation begins.
  • Figure 2: Schematic representation of DISNETS. The framework consists of (i) the state/context $s$, (ii) a neural network module that provides the non-linear representation of the context $\phi_{\omega}(s)$, (iii) a linear Thompson Sampling module that chooses a super-action $\theta\in\mathcal{K}$ corresponding to the set of orthogonal channels used to transmit data, and (iv) the reward $r$ (incorporated within the FCI), which is used to update the parameters of both modules.
  • Figure 3: Convergence performance of DISNETS in terms of the empirical average and standard deviation of the training loss (top), reward (center), and latency (bottom). We consider uniformly aperiodic traffic, with $t_{min}=2$ ms, $t_{max}=6$ ms, and $N=60$.
  • Figure 4: Overhead performance measured in terms of the size of the FCI (proposed) vs. the 3GPP NR DCI, as a function of the number of orthogonal channels (top) and UEs (bottom). We consider two DCI formats, namely DCI$_m$ and DCI$_M$, which require up to 10 and 37 additional bits, respectively, for resource allocation [parkvall20185g].
  • Figure 5: Empirical CDF of the number of orthogonal channels used at each scheduling opportunity over the last $10$ packets, for DISNETS vs. RandomK, as a function of the number of UEs.
  • ...and 4 more figures
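
The pipeline described in the Figure 2 caption (context $s$, representation $\phi_{\omega}(s)$, Thompson Sampling over a set of channels, reward-driven update) can be sketched roughly as follows. This is a minimal single-agent sketch under stated assumptions, not the authors' implementation: a frozen random tanh layer stands in for the trained network, each channel gets its own Bayesian linear regression head, the super-action is formed by taking the $k$ highest sampled scores, and the feedback is assumed to be a per-channel 0/1 reward. All class, function, and parameter names are hypothetical.

```python
import numpy as np


class NeuralLinearTS:
    """Toy neural-linear Thompson Sampling agent picking k channels at once."""

    def __init__(self, context_dim, num_channels, feature_dim=8,
                 noise_var=0.25, prior_precision=1.0, seed=0):
        self.rng = np.random.default_rng(seed)
        # Assumption: a fixed random layer replaces the trained network phi_omega.
        self.W = self.rng.normal(scale=1.0 / np.sqrt(context_dim),
                                 size=(feature_dim, context_dim))
        self.noise_var = noise_var
        # Per-channel posterior statistics: precision matrix A and vector b.
        self.A = np.stack([prior_precision * np.eye(feature_dim)
                           for _ in range(num_channels)])
        self.b = np.zeros((num_channels, feature_dim))

    def features(self, s):
        return np.tanh(self.W @ np.asarray(s, dtype=float))

    def select(self, s, k):
        """Sample one weight vector per channel; return the k best channels."""
        phi = self.features(s)
        scores = np.empty(len(self.A))
        for ch in range(len(self.A)):
            mean = np.linalg.solve(self.A[ch], self.b[ch])
            cov = self.noise_var * np.linalg.inv(self.A[ch])
            cov = 0.5 * (cov + cov.T)  # remove tiny numerical asymmetry
            w = self.rng.multivariate_normal(mean, cov)
            scores[ch] = w @ phi
        return np.argsort(scores)[-k:][::-1]

    def update(self, s, channels, rewards):
        """Bayesian linear regression update for the channels that were used."""
        phi = self.features(s)
        for ch, r in zip(channels, rewards):
            self.A[ch] += np.outer(phi, phi)
            self.b[ch] += r * phi


def run_toy(rounds=300, num_channels=7, k=2, seed=0):
    """Toy environment (an assumption): channels 0 and 1 never collide."""
    agent = NeuralLinearTS(context_dim=2, num_channels=num_channels, seed=seed)
    s = [1.0, 0.0]  # fixed context, for illustration only
    free = {0, 1}
    hits = []
    for _ in range(rounds):
        chosen = agent.select(s, k)
        rewards = [1.0 if int(ch) in free else 0.0 for ch in chosen]
        agent.update(s, chosen, rewards)
        hits.append(sum(rewards) / k)
    # Share of collision-free picks over the last 50 rounds.
    return float(np.mean(hits[-50:]))


if __name__ == "__main__":
    print(run_toy())
```

Keeping the Bayesian update linear in the features is what makes posterior sampling cheap here: only the last layer is treated probabilistically, while the representation is learned (or, in this sketch, fixed) separately.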