SoK: The Pitfalls of Deep Reinforcement Learning for Cybersecurity

Shae McFadden, Myles Foley, Elizabeth Bates, Ilias Tsingenopoulos, Sanyam Vyas, Vasilios Mavroudis, Chris Hicks, Fabio Pierazzi

TL;DR

This work addresses the gap between DRL's successes in traditional domains and its application to cybersecurity by identifying 11 methodological pitfalls that arise when applying DRL to security tasks. It organizes them into a four-stage taxonomy (modeling, training, evaluation, and deployment) and reports a prevalence analysis over 66 DRL4Sec papers (2018–2025), finding an average of over five pitfalls per paper. Through controlled experiments in autonomous cyber defense (MiniCAGE), adversarial malware generation (AutoRobust), and web security testing (Link/Sqirl), the authors quantify the practical impact of these pitfalls and show how they can inflate reported performance or undermine transfer to real deployments. The paper then offers actionable recommendations for each pitfall to improve the reproducibility, reliability, and deployability of DRL-based security systems.

Abstract

Deep Reinforcement Learning (DRL) has achieved remarkable success in domains requiring sequential decision-making, motivating its application to cybersecurity problems. However, transitioning DRL from laboratory simulations to bespoke cyber environments can introduce numerous issues. These issues are further exacerbated by the often adversarial, non-stationary, and partially observable nature of most cybersecurity tasks. In this paper, we identify and systematize 11 methodological pitfalls that frequently occur in DRL for cybersecurity (DRL4Sec) literature across the stages of environment modeling, agent training, performance evaluation, and system deployment. By analyzing 66 significant DRL4Sec papers (2018–2025), we quantify the prevalence of each pitfall and find an average of over five pitfalls per paper. We demonstrate the practical impact of these pitfalls using controlled experiments in (i) autonomous cyber defense, (ii) adversarial malware creation, and (iii) web security testing environments. Finally, we provide actionable recommendations for each pitfall to support the development of more rigorous and deployable DRL-based security systems.

Paper Structure

This paper contains 32 sections, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Common pitfalls of DRL when applied to cybersecurity, organized by development stage and relevant case studies.
  • Figure 2: Training performance of different hyperparameter sets across 20 runs, with 95% CIs showing variance. Zero is the theoretical maximum score in MiniCAGE.
  • Figure 3: Comparison of mean performance, with 95% and 99% CIs, for trained and random policies in AutoRobust across 20 runs with 20 evaluation episodes each.
  • Figure 4: Complete breakdown of pitfall prevalence.
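
The captions for Figures 2 and 3 report means with 95% and 99% confidence intervals over 20 runs. As a rough illustration of that kind of reporting, the sketch below computes a t-based interval over per-run scores. It is not the authors' code: the function name `mean_ci`, the use of a Student-t interval, and the hard-coded critical values for 19 degrees of freedom are all assumptions made for this example, and the paper may compute its intervals differently.

```python
import math
import statistics

# Two-sided Student-t critical values for 19 degrees of freedom (20 runs).
# Assumption for this sketch: intervals are t-based; the paper may differ.
T_CRIT_DF19 = {0.95: 2.093, 0.99: 2.861}


def mean_ci(run_scores, level=0.95):
    """Return (mean, half_width) of a t-based confidence interval over runs."""
    n = len(run_scores)
    if n != 20:
        raise ValueError("critical values are tabulated for exactly 20 runs")
    mean = statistics.fmean(run_scores)
    # Standard error of the mean, using the sample standard deviation.
    sem = statistics.stdev(run_scores) / math.sqrt(n)
    return mean, T_CRIT_DF19[level] * sem


if __name__ == "__main__":
    # Hypothetical per-run scores (each could be the mean of 20 evaluation episodes).
    scores = [float(i) for i in range(20)]
    for level in (0.95, 0.99):
        mean, half_width = mean_ci(scores, level)
        print(f"{int(level * 100)}% CI: {mean:.2f} +/- {half_width:.2f}")
```

Reporting a random-policy baseline with the same interval, as Figure 3 does, makes it easier to see whether a trained policy's advantage exceeds run-to-run variance.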