Coverage Analysis of Multi-Environment Q-Learning Algorithms for Wireless Network Optimization

Talha Bozkus; Urbashi Mitra

TL;DR

Numerical simulations show that the proposed algorithm can achieve 50% lower policy error and 40% lower runtime complexity than state-of-the-art reinforcement learning algorithms, and that it is robust to changes in network settings and parameters.

Abstract

Q-learning is widely used to optimize wireless networks with unknown system dynamics. Recent advancements include ensemble multi-environment hybrid Q-learning algorithms, which run multiple Q-learning algorithms across structurally related but distinct Markovian environments and outperform existing Q-learning algorithms in both accuracy and complexity in large-scale wireless networks. We herein conduct a comprehensive coverage analysis to ensure optimal data coverage conditions for these algorithms. First, we establish upper bounds on the expectation and variance of different coverage coefficients. Leveraging these bounds, we present an algorithm for efficient initialization of these algorithms. We test our algorithm on two distinct real-world wireless networks. Numerical simulations show that our algorithm can achieve 50% lower policy error and 40% lower runtime complexity than state-of-the-art reinforcement learning algorithms. Furthermore, our algorithm is robust to changes in network settings and parameters. We also numerically validate our theoretical results.

Paper Structure (11 sections, 5 theorems, 10 equations, 2 figures, 1 algorithm)

Introduction
System Model and Tools
Markov Decision Processes
Q-Learning
Coverage Coefficient
Theoretical Analysis
Assumptions and Preliminaries
Bounds on the coverage coefficient
Comparing the bounds of different environments
Numerical Results
Conclusions

Key Result

Proposition 1

Let $\pi = \pi^{(n)}$, the estimated policy of the $n^{\text{th}}$ environment in the nEQL algorithm, in the coverage coefficient definition (Equ:CC). Then:

Figures (2)

Figure 1: Example wireless network models.
Figure 2: Numerical simulations.

Theorems & Definitions (5)

Proposition 1
Proposition 2
Proposition 3
Proposition 4
Proposition 5
