RL-Based Hyperparameter Selection for Spectrum Sensing With CNNs

Amir Mehrabian; Maryam Sabbaghian; Halim Yanikomeroglu

TL;DR

This work tackles hyperparameter and architecture selection for CNN-based spectrum sensing in cognitive radios by introducing a Q-learning–based neural architecture search (NAS) method that automatically constructs CNN detectors tailored to different signal, channel, and noise models. It also casts dynamic sensing-time adaptation as a multi-armed bandit problem that balances throughput, interference, and energy use. The NAS-CNNs customized for three datasets outperform several state-of-the-art detectors in $P_c$ and ROC performance, while the sensing-time policy improves the achieved reward by about 20% over conventional policies in a simulated time-varying scenario. The approach enables adaptive, resource-aware spectrum sensing that improves both detection reliability and efficiency in practical cognitive radio networks.

Abstract

Selecting hyperparameters for deep neural networks is challenging because the search space is large and each type of layer comes with its own hyperparameters. Neural architecture selection for convolutional neural networks (CNNs) has so far received little attention in spectrum sensing. Here, we develop a method based on reinforcement learning, specifically Q-learning, that systematically searches and evaluates architectures on generated datasets covering different signals and channels in the spectrum sensing problem. We show through extensive simulations that the CNN-based detectors produced by our method outperform several detectors from the literature. For the most complex dataset, the proposed approach improves accuracy by 9% at the cost of higher computational complexity. Furthermore, we propose a novel method that uses a multi-armed bandit model to select the sensing time, aiming for higher throughput and accuracy while minimizing the consumed energy. The method dynamically adjusts the sensing time under time-varying channel conditions without prior information. We demonstrate in a simulated scenario that the proposed method improves the achieved reward by about 20% compared to conventional policies. Consequently, this study effectively manages the selection of important hyperparameters for CNN-based detectors, yielding better performance for the cognitive radio network.
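To make the multi-armed bandit view of sensing-time selection concrete, the following is a minimal sketch and not the authors' algorithm. It assumes an epsilon-greedy policy with a constant step size (a common choice for tracking non-stationary rewards), and the names `run_bandit` and `toy_reward`, the detection model, the SNR switch, and all numeric constants are hypothetical and chosen only for illustration.

```python
import math
import random


def toy_reward(tau, t, rng, frame=1.0, energy_cost=0.2, switch_at=1000):
    """Hypothetical reward for sensing a fraction `tau` of a frame at step `t`."""
    # Assumed non-stationarity: the SNR drops once `switch_at` steps have passed.
    snr = 4.0 if t < switch_at else 1.0
    # Assumed detection model: longer sensing and higher SNR raise P_d.
    p_detect = 1.0 - math.exp(-10.0 * snr * tau)
    detected = rng.random() < p_detect
    # Throughput term: time left to transmit, earned only on a correct decision.
    throughput = (frame - tau) if detected else 0.0
    # Energy term: sensing longer costs more energy.
    return throughput - energy_cost * tau


def run_bandit(sensing_times, reward_fn, steps=2000, epsilon=0.1,
               step_size=0.1, seed=0):
    """Epsilon-greedy bandit in which each arm is one candidate sensing time."""
    rng = random.Random(seed)
    n_arms = len(sensing_times)
    q = [0.0] * n_arms      # running reward estimate per arm
    counts = [0] * n_arms   # number of times each arm was chosen
    for t in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                  # explore
        else:
            arm = max(range(n_arms), key=lambda i: q[i])  # exploit
        reward = reward_fn(sensing_times[arm], t, rng)
        # Constant step size weights recent rewards more heavily, so the
        # estimates can follow a channel that changes over time.
        q[arm] += step_size * (reward - q[arm])
        counts[arm] += 1
    return q, counts
```

A call such as `q, counts = run_bandit([0.05, 0.1, 0.2, 0.4], toy_reward)` returns the final reward estimates and the number of times each candidate sensing time was selected. Under this toy model, no prior knowledge of the SNR is given to the policy; it only observes rewards.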

Paper Structure (15 sections, 13 equations, 12 figures, 7 tables, 1 algorithm)

Introduction
Signal model
Developing a Neural Architecture Search Method
State Parameters
Action Parameters
Episodes and Rewards
Updating Action-Value Functions
The Developed Algorithm for NAS
RL for Selecting Sensing Time
Simulations and Results
Receiver Operating Curve
Probability of Detection Versus GSNR
Discrepancy of Parameters and Complexity of Methods
Sensing Time Selection
Conclusion

Figures (12)

Figure 1: Time frames with sensing and transmission times for SU.
Figure 2: Probability of detection of detectors with long and short sensing times for three intervals of SNR.
Figure 3: $P_d$ versus SNR for detectors under Gaussian signal and noise.
Figure 4: ROC of detectors for Dataset 2 and test signals with $\mathsf{GSNR}=5$ dB.
Figure 5: ROC of detectors for Dataset 3 and test signals with $\mathsf{GSNR}=3$ dB.
...and 7 more figures
