Table of Contents
Fetching ...

Coverage Analysis of Multi-Environment Q-Learning Algorithms for Wireless Network Optimization

Talha Bozkus, Urbashi Mitra

TL;DR

Numerical simulations show that the proposed algorithm can achieve %50 less policy error and %40 less runtime complexity than state-of-the-art reinforcement learning algorithms and exhibits robustness to changes in network settings and parameters.

Abstract

Q-learning is widely used to optimize wireless networks with unknown system dynamics. Recent advancements include ensemble multi-environment hybrid Q-learning algorithms, which utilize multiple Q-learning algorithms across structurally related but distinct Markovian environments and outperform existing Q-learning algorithms in terms of accuracy and complexity in large-scale wireless networks. We herein conduct a comprehensive coverage analysis to ensure optimal data coverage conditions for these algorithms. Initially, we establish upper bounds on the expectation and variance of different coverage coefficients. Leveraging these bounds, we present an algorithm for efficient initialization of these algorithms. We test our algorithm on two distinct real-world wireless networks. Numerical simulations show that our algorithm can achieve %50 less policy error and %40 less runtime complexity than state-of-the-art reinforcement learning algorithms. Furthermore, our algorithm exhibits robustness to changes in network settings and parameters. We also numerically validate our theoretical results.

Coverage Analysis of Multi-Environment Q-Learning Algorithms for Wireless Network Optimization

TL;DR

Numerical simulations show that the proposed algorithm can achieve %50 less policy error and %40 less runtime complexity than state-of-the-art reinforcement learning algorithms and exhibits robustness to changes in network settings and parameters.

Abstract

Q-learning is widely used to optimize wireless networks with unknown system dynamics. Recent advancements include ensemble multi-environment hybrid Q-learning algorithms, which utilize multiple Q-learning algorithms across structurally related but distinct Markovian environments and outperform existing Q-learning algorithms in terms of accuracy and complexity in large-scale wireless networks. We herein conduct a comprehensive coverage analysis to ensure optimal data coverage conditions for these algorithms. Initially, we establish upper bounds on the expectation and variance of different coverage coefficients. Leveraging these bounds, we present an algorithm for efficient initialization of these algorithms. We test our algorithm on two distinct real-world wireless networks. Numerical simulations show that our algorithm can achieve %50 less policy error and %40 less runtime complexity than state-of-the-art reinforcement learning algorithms. Furthermore, our algorithm exhibits robustness to changes in network settings and parameters. We also numerically validate our theoretical results.
Paper Structure (11 sections, 5 theorems, 10 equations, 2 figures, 1 algorithm)

This paper contains 11 sections, 5 theorems, 10 equations, 2 figures, 1 algorithm.

Key Result

Proposition 1

Let $\pi=\pi^{(n)}$ (estimated policy of the $n^{th}$ environment in nEQL algorithm) in (Equ:CC). Then:

Figures (2)

  • Figure 1: Examples wireless network models.
  • Figure 2: Numerical simulations

Theorems & Definitions (5)

  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Proposition 4
  • Proposition 5