RFRL Gym: A Reinforcement Learning Testbed for Cognitive Radio Applications

Daniel Rosen, Illa Rochez, Caleb McIrvin, Joshua Lee, Kevin D'Alessandro, Max Wiecek, Nhan Hoang, Ramzy Saffarini, Sam Philips, Vanessa Jones, Will Ivey, Zavier Harris-Smart, Zavion Harris-Smart, Zayden Chin, Amos Johnson, Alyse M. Jones, William C. Headley

TL;DR

The paper tackles RF spectrum congestion and interference by introducing RFRL Gym, a reinforcement learning testbed for cognitive radio applications that is OpenAI Gym–compatible and highly configurable. It provides modular components for non-player and RL entities, multiple reward and observation modes, rendering options, and JSON-defined scenarios, enabling realistic RL experimentation in spectrum access and jamming tasks. Through example scenarios (single-entity jamming, non-Markovian hopping, solvable and unsolvable multi-entity DSA) and a MushroomRL-driven training workflow, the work demonstrates the environment’s ability to reveal learning behavior, convergence, and limitations, including the need for memory or more advanced algorithms in non-stationary settings. Looking ahead, the authors plan to add multi-agent RL support, physical signal integration with hardware, and GUI enhancements to broaden accessibility and practical deployment, aiming to accelerate RL research in wireless communications.

Abstract

Radio Frequency Reinforcement Learning (RFRL) is anticipated to be a widely applicable technology in the next generation of wireless communication systems, particularly 6G and next-gen military communications. Given this, our research is focused on developing a tool to promote the development of RFRL techniques that leverage spectrum sensing. In particular, the tool was designed to address two cognitive radio applications: dynamic spectrum access and jamming. To train and test reinforcement learning (RL) algorithms for these applications, a simulation environment is needed to reproduce the conditions that an agent will encounter within the Radio Frequency (RF) spectrum. In this paper, such an environment has been developed, herein referred to as the RFRL Gym. Through the RFRL Gym, users can design their own scenarios to model what an RL agent may encounter within the RF spectrum, as well as experiment with different spectrum sensing techniques. Additionally, the RFRL Gym is a subclass of OpenAI Gym, enabling the use of third-party ML/RL libraries. We plan to open-source this codebase to enable other researchers to use the RFRL Gym to test their own scenarios and RL algorithms, ultimately advancing RL research in the wireless communications domain. This paper describes in further detail the components of the Gym, results from example scenarios, and plans for future additions.

Index Terms: machine learning, reinforcement learning, wireless communications, dynamic spectrum access, OpenAI Gym

Paper Structure

This paper contains 26 sections, 1 equation, 7 figures, and 3 tables.

Figures (7)

  • Figure 1: The RFRL Gym framework, illustrating how user-defined scenarios and reinforcement learning algorithms interact with the gym environment. This framework facilitates simulating an RF-based reinforcement learning interaction loop of the RL agent taking in states (e.g. sensing results) and rewards (e.g. receiver performance) and providing to the gym its determined next action (e.g. transmitter tuning).
  • Figure 2: GUI-based scenario generator where users can configure the environment, the rendering, and the entities.
  • Figure 3: Cumulative reward results for Scenario 1, showing that the agent converges to the optimal reward of 100.
  • Figure 4: PyQt render results for Scenario 2, shown in advanced mode where IQ data is viewed. The diagonal signal is the RL agent, while the horizontal signal is the non-RL entity the agent is attempting to jam.
  • Figure 5: Scenario 3 render results, showing behavior before and after convergence, in the gamified detect mode.
  • ...and 2 more figures
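
The interaction loop described in the Figure 1 caption (the agent receives a state and a reward, then returns its next action) can be illustrated with a small, self-contained Python sketch. Everything here is hypothetical and is not the RFRL Gym API: `ToySpectrumEnv`, its parameters, the one-hot sensing vector, and the jamming-style reward are assumptions made for illustration, and the learner is a plain epsilon-greedy running-average estimator rather than any algorithm used in the paper. The environment only mimics the classic Gym `reset()`/`step()` convention without importing the library.

```python
import random


class ToySpectrumEnv:
    """Hypothetical stand-in for a Gym-style RF environment (not the RFRL Gym API)."""

    def __init__(self, num_channels=5, target_channel=2, max_steps=20):
        self.num_channels = num_channels
        self.target_channel = target_channel  # channel used by the non-RL entity
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return self._sense()

    def _sense(self):
        # Idealized sensing result: one-hot occupancy vector over the channels.
        return tuple(int(c == self.target_channel) for c in range(self.num_channels))

    def step(self, action):
        # Jamming-style reward (assumed): +1 when the agent tunes to the occupied channel.
        reward = 1.0 if action == self.target_channel else 0.0
        self.t += 1
        done = self.t >= self.max_steps
        return self._sense(), reward, done, {}


def greedy_action(env, q, obs):
    # Pick the action with the highest estimated reward for this observation.
    return max(range(env.num_channels), key=lambda a: q.get((obs, a), 0.0))


def train(env, episodes=200, epsilon=0.1, alpha=0.5, seed=0):
    rng = random.Random(seed)
    q = {}  # maps (observation, action) -> running estimate of the reward
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            # Explore with probability epsilon, otherwise exploit the estimates.
            if rng.random() < epsilon:
                action = rng.randrange(env.num_channels)
            else:
                action = greedy_action(env, q, obs)
            next_obs, reward, done, _ = env.step(action)
            old = q.get((obs, action), 0.0)
            q[(obs, action)] = old + alpha * (reward - old)
            obs = next_obs
    return q


def greedy_episode_reward(env, q):
    # Run one episode with exploration turned off and sum the rewards.
    obs, done, total = env.reset(), False, 0.0
    while not done:
        obs, reward, done, _ = env.step(greedy_action(env, q, obs))
        total += reward
    return total


if __name__ == "__main__":
    env = ToySpectrumEnv()
    q = train(env)
    print("greedy episode reward:", greedy_episode_reward(env, q))
```

Because the toy entity never changes channel, a memoryless learner is enough here; the non-Markovian hopping scenario mentioned in the TL;DR is the kind of case where such a learner would be expected to fall short.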