Fine-tuning of Large Language Models for Domain-Specific Cybersecurity Knowledge
Yuan Huang
TL;DR
This work addresses the challenge of adapting general-purpose LLMs to domain-specific cybersecurity knowledge for QA tasks. It compares three fine-tuning strategies, Supervised Fine-Tuning (SFT), Low-Rank Adaptation (LoRA), and Quantized Low-Rank Adaptation (QLoRA), using the Llama 3 base model and the CyberMetric-10000 dataset. Results show that all three methods outperform the zero-shot baseline, with LoRA and QLoRA matching or exceeding the accuracy of SFT at lower computational cost, owing to parameter-efficient updates and, in the case of QLoRA, 4-bit quantization. The findings highlight the practicality of low-rank, quantized fine-tuning as a scalable way to embed domain expertise in large language models for cybersecurity applications, enabling faster adaptation on limited hardware.
Abstract
Recent advancements in training paradigms for Large Language Models (LLMs) have unlocked their remarkable capabilities in natural language processing and cross-domain generalization. While LLMs excel in tasks like programming and mathematical problem-solving, their zero-shot performance in specialized domains requiring expert knowledge, such as cybersecurity, is often suboptimal. This limitation arises because foundational LLMs are designed for general-purpose applications, constraining their ability to encapsulate domain-specific expertise within their parameter space. To address this, we explore fine-tuning strategies to embed cybersecurity knowledge into LLMs, enhancing their performance in cybersecurity question-answering (Q&A) tasks while prioritizing computational efficiency. Specifically, we investigate Supervised Fine-Tuning (SFT), Low-Rank Adaptation (LoRA), and Quantized Low-Rank Adaptation (QLoRA) using a cybersecurity Q&A dataset. Our results demonstrate that these fine-tuning approaches significantly outperform the foundational model in cybersecurity Q&A tasks. Moreover, LoRA and QLoRA achieve comparable performance to SFT with substantially lower computational costs, offering an efficient pathway for adapting LLMs to specialized domains. Our work highlights the potential of low-rank fine-tuning strategies to bridge the gap between general-purpose LLMs and domain-specific applications.
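To make the efficiency argument concrete, the following is a minimal sketch of the arithmetic behind low-rank and quantized fine-tuning. It is not taken from the paper: the layer size (4096 x 4096), the rank (8), and the helper names are hypothetical values chosen for illustration. It relies only on the standard LoRA formulation, in which a frozen weight matrix W of shape d_out x d_in receives a trainable update B A, with B of shape d_out x r and A of shape r x d_in. The memory estimate ignores the small overhead of quantization constants that real 4-bit schemes store.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA update B @ A.

    B has shape (d_out, rank) and A has shape (rank, d_in);
    the original weight matrix stays frozen.
    """
    return rank * (d_out + d_in)


def weight_memory_bytes(num_params: int, bits_per_param: int) -> int:
    """Approximate storage for the weights alone (no quantization metadata)."""
    return num_params * bits_per_param // 8


# Hypothetical layer size and rank, for illustration only.
d_out, d_in, rank = 4096, 4096, 8

full = full_finetune_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)

print(f"full fine-tuning: {full} trainable parameters")
print(f"LoRA (r={rank}): {lora} trainable parameters")
print(f"trainable fraction: {lora / full:.4%}")

# Frozen base weights stored in 16-bit versus 4-bit precision.
print(f"16-bit base weights: {weight_memory_bytes(full, 16)} bytes")
print(f"4-bit base weights:  {weight_memory_bytes(full, 4)} bytes")
```

Under these assumed values, the low-rank update trains well under one percent of the layer's parameters, and storing the frozen base weights in 4 bits takes a quarter of the space of 16-bit storage. These two effects are what LoRA and QLoRA, respectively, exploit to reduce computational cost.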
