Hackphyr: A Local Fine-Tuned LLM Agent for Network Security Environments

Maria Rigaki, Carlos Catania, Sebastian Garcia

TL;DR

Hackphyr is a locally fine-tuned LLM designed to act as a red-team agent within network security environments. It achieves performance comparable with much larger and more powerful commercial models such as GPT-4.

Abstract

Large Language Models (LLMs) have shown remarkable potential across various domains, including cybersecurity. Using commercial cloud-based LLMs may be undesirable due to privacy concerns, costs, and network connectivity constraints. In this paper, we present Hackphyr, a locally fine-tuned LLM to be used as a red-team agent within network security environments. Our fine-tuned 7-billion-parameter model can run on a single GPU card and achieves performance comparable with much larger and more powerful commercial models such as GPT-4. Hackphyr clearly outperforms other models, including GPT-3.5-turbo, and baselines such as Q-learning agents, in complex, previously unseen scenarios. To achieve this performance, we generated a new task-specific cybersecurity dataset to enhance the base model's capabilities. Finally, we conducted a comprehensive analysis of the agents' behaviors that provides insights into the planning abilities and potential shortcomings of such agents, contributing to the broader understanding of LLM-based agents in cybersecurity contexts.

Paper Structure

This paper contains 37 sections, 3 equations, 10 figures, and 4 tables.

Figures (10)

  • Figure 1: LLM Agent Components
  • Figure 2: Supervised Fine-tuning Methodology
  • Figure 3: Win rate (%) confidence intervals by model. The blue area indicates the desired effect size.
  • Figure 4: Small and full network scenarios [rigaki_out_2024]. The small scenario has only one client in the client subnet, while the full scenario has all five clients.
  • Figure 5: Three Subnets Scenario Topology. The clients can only access subnet A directly. The goal is to exfiltrate data from subnet B by first gaining a foothold to subnet A.
  • ...and 5 more figures