A Survey of Large Language Models in Cybersecurity

Gabriel de Jesus Coelho da Silva; Carlos Becker Westphall

TL;DR

This survey analyzes how large language models (LLMs) are applied in cybersecurity, spanning autonomous pentesting, vulnerability repair, phishing detection, and CTF challenge solving. It highlights key limitations, including context loss, hallucinations, and the lack of reliable grounding, and proposes a Mixture-of-Experts framework that delegates tasks to specialized LLMs to improve reliability and scalability. The work synthesizes prominent studies (PentestGPT, zero-shot vulnerability repair, and AI-assisted CTFs) to map current capabilities and gaps, and it outlines concrete future directions such as domain-specific fine-tuning, retrieval-augmented generation (RAG), Chain-of-Verification (CoVe), and governance. The findings give researchers and practitioners a view of practical deployment considerations for safer, more capable AI-assisted cybersecurity tooling.

Abstract

Large Language Models (LLMs) have quickly risen to prominence due to their ability to perform at or close to the state of the art in a variety of fields while handling natural language. An important area of research is the application of such models in the cybersecurity context. This survey aims to identify where in the field of cybersecurity LLMs have already been applied, the ways in which they are being used, and their limitations in the field. Finally, suggestions are made on how to mitigate these limitations and what can be expected from these systems once the limitations are overcome.

Paper Structure (25 sections, 2 figures)

This paper contains 25 sections, 2 figures.

Introduction
Motivation
Justifications
Objectives
General objectives
Specific objectives
Paper organization
Basic concepts
Cybersecurity
Neural Networks
Natural Language Processing and Large Language Models
Related work
PentestGPT: An LLM-empowered Automatic Penetration Testing Tool
Getting pwn'd by AI: Penetration Testing with Large Language Models
Examining Zero-Shot Vulnerability Repair with Large Language Models
...and 10 more sections

Figures (2)

Figure 1: Number of publications per year per keyword
Figure 2: Diagram of the proposed system.
