SINR-Aware Deep Reinforcement Learning for Distributed Dynamic Channel Allocation in Cognitive Interference Networks

Yaniv Cohen; Tomer Gafni; Ronen Greenberg; Kobi Cohen

TL;DR

The paper tackles dynamic channel allocation in large-scale cognitive interference networks where channels are non-orthogonal and inter-carrier interference (ICI) is present. It introduces CARLTON, a Centralized Training with Decentralized Execution (CTDE) multi-agent reinforcement learning framework that uses the DeepMellow value-based learner to maximize a global SINR measure while satisfying per-network QoS-SINR constraints, relying on a low-dimensional observation space and a reward structure that combines personal and social welfare. CARLTON outperforms the Random Agent and Jammer Avoidance Response (JAR) baselines, comes within about 2.5% of centralized graph-coloring performance, and generalizes robustly to unseen numbers of networks. The approach yields high channel quality with fast convergence and reduced spectrum mobility, making it a promising scalable solution for distributed channel allocation in interference-rich cognitive networks. Key contributions include a two-tier reward design, neighbor-based social welfare, an observation preprocessing strategy, action masking to avoid collisions, and thorough evaluation in realistic, large-scale simulations.

Abstract

We consider the problem of dynamic channel allocation (DCA) in cognitive communication networks with the goal of maximizing a global signal-to-interference-plus-noise ratio (SINR) measure under a specified target quality of service (QoS)-SINR for each network. The shared bandwidth is partitioned into K channels with frequency separation. In contrast to the majority of existing studies that assume perfect orthogonality or a one-to-one user-channel allocation mapping, this paper focuses on real-world systems experiencing inter-carrier interference (ICI) and channel reuse by multiple large-scale networks. This realistic scenario significantly increases the problem dimension, rendering existing algorithms inefficient. We propose a novel multi-agent reinforcement learning (RL) framework for distributed DCA, named Channel Allocation RL To Overlapped Networks (CARLTON). The CARLTON framework is based on the Centralized Training with Decentralized Execution (CTDE) paradigm, utilizing the DeepMellow value-based RL algorithm. To ensure robust performance in the interference-laden environment we address, CARLTON employs a low-dimensional representation of observations, generating a QoS-type measure while maximizing a global SINR measure and ensuring the target QoS-SINR for each network. Our results demonstrate exceptional performance and robust generalization, showcasing superior efficiency compared to alternative state-of-the-art methods, while achieving a marginally diminished performance relative to a fully centralized approach.
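
The abstract names DeepMellow as the value-based learner. DeepMellow replaces the hard max in the Q-learning target with the Mellowmax operator, mm_ω(x) = (1/ω) log((1/n) Σ exp(ω x_i)). The following is a minimal sketch of that operator and the resulting bootstrap target, not the authors' implementation: the function names and the values of `omega` and `gamma` are illustrative assumptions, and the paper's actual hyperparameters are not given in this section.

```python
import math

def mellowmax(values, omega=5.0):
    """Mellowmax of a list of Q-values: (1/omega) * log(mean(exp(omega * v))).

    omega is an assumed temperature; this sketch supports omega > 0 only.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if omega <= 0:
        raise ValueError("this sketch supports omega > 0 only")
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(values)
    total = sum(math.exp(omega * (v - m)) for v in values)
    return m + math.log(total / len(values)) / omega

def mellow_td_target(reward, next_q_values, gamma=0.99, omega=5.0):
    """One-step target that uses mellowmax where DQN would use max.

    gamma is a hypothetical discount factor chosen for illustration.
    """
    return reward + gamma * mellowmax(next_q_values, omega)
```

For ω > 0 the result always lies between the mean and the maximum of the inputs, moving toward the maximum as ω grows and toward the mean as ω approaches zero.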

Paper Structure (16 sections, 22 equations, 15 figures, 2 tables, 4 algorithms)

Introduction
DRL Algorithms for DCA
Main Results
System Model and Problem Formulation
Description of the System
Illustration of the System Environment in Real-World Simulations
The Objective
The Channel Allocation RL To Overlapped Networks (CARLTON) Algorithm
Observation Space
Action Space
Reward
Training Procedure
Experiments and Discussion
CARLTON's Training Results
Performance Comparison
...and 1 more section

Figures (15)

Figure 1: An illustration of the non-orthogonal channel partition. Channel $i$, denoted by $C_i$ in the figure, refers to the carrier frequency $f_i$.
Figure 2: An illustration of the simulation environment involving six networks. Each color corresponds to a distinct network, distinguished by a unique serial number. The symbol 'Ma' denotes the network manager present at each respective network.
Figure 3: CARLTON performance: The accumulated rewards as a function of episode number during training, with and without the masking approach. For a perfect game, the maximum value is 88.
Figure 4: CARLTON performance: The empirical results of the mean ($\overline{CQ}$), median ($\widetilde{CQ}$), and minimum ($min_{CQ}$) values of channel quality during training.
Figure 5: CARLTON performance: The score values of the average number of channel changes (ANCC), convergence time (CT), spectrum efficiency (SE), and the weighted score (WS) during training.
...and 10 more figures
