LLMs for Domain Generation Algorithm Detection

Reynier Leyva La O, Carlos A. Catania, Tatiana Parlanti

Abstract

This work analyzes the use of large language models (LLMs) for detecting domain generation algorithms (DGAs). We perform a detailed evaluation of two techniques, In-Context Learning (ICL) and Supervised Fine-Tuning (SFT), showing how each can improve detection. SFT increases performance by training on domain-specific data, whereas ICL helps the detection model adapt quickly to new threats without requiring much retraining. We evaluate Meta's Llama3 8B model on a custom dataset containing 68 malware families and normal domains, covering several hard-to-detect schemes, including recent word-based DGAs. The results show that LLM-based methods can achieve competitive results in DGA detection. In particular, the SFT-based LLM DGA detector outperforms state-of-the-art models that use attention layers, achieving 94% accuracy with a 4% false positive rate (FPR) and excelling at detecting word-based DGA domains.
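
For readers unfamiliar with the two metrics quoted in the abstract, the following minimal sketch shows how accuracy and FPR are conventionally computed for a binary detector. The label encoding (1 for DGA, 0 for legitimate) and the function name `accuracy_and_fpr` are assumptions made for illustration; this is not the authors' evaluation code.

```python
def accuracy_and_fpr(y_true, y_pred):
    """Return (accuracy, false positive rate) for binary labels.

    Assumed encoding (not from the paper): 1 = DGA, 0 = legitimate.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # DGA flagged as DGA
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate flagged as DGA

    accuracy = (tp + tn) / len(pairs)
    # FPR is the share of legitimate domains that were wrongly flagged.
    negatives = fp + tn
    fpr = fp / negatives if negatives else 0.0
    return accuracy, fpr


if __name__ == "__main__":
    # Toy labels, purely illustrative.
    acc, fpr = accuracy_and_fpr([1, 1, 0, 0], [1, 0, 1, 0])
    print(f"accuracy={acc:.2f} fpr={fpr:.2f}")
```

Under this definition, a 4% FPR means that roughly 4 out of every 100 legitimate domains would be flagged as DGA-generated.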

Paper Structure

This paper contains 17 sections, 12 figures, 9 tables.

Figures (12)

  • Figure 1: Distribution of domains for the different training methods.
  • Figure 2: Example format for training data (for SFT), including domain and label.
  • Figure 3: Prompt used on Llama3 8B for classification of domain names in ICL.
  • Figure 4: Model evaluation diagram.
  • Figure 5: Systematic sampling: 30 samples of 50 legit and 50 DGA domains. Each circle represents 50 different domains, and all circles are disjoint.
  • ...and 7 more figures
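
The Figure 5 caption describes 30 disjoint samples, each holding 50 legitimate and 50 DGA domains. The sketch below shows one way such samples could be drawn with textbook systematic sampling, where sample i takes every 30th domain starting at offset i. The function name, parameters, and synthetic domain names are hypothetical, and the authors' actual procedure may differ.

```python
def systematic_disjoint_samples(legit, dga, n_samples=30, per_class=50):
    """Split two domain pools into disjoint, class-balanced samples.

    Sample i takes elements i, i + n_samples, i + 2 * n_samples, ... from
    each pool, so no domain can appear in two samples.
    """
    # Drop duplicates while preserving order; disjointness relies on this.
    legit = list(dict.fromkeys(legit))
    dga = list(dict.fromkeys(dga))

    needed = n_samples * per_class
    if len(legit) < needed or len(dga) < needed:
        raise ValueError(f"each pool needs at least {needed} unique domains")

    samples = []
    for i in range(n_samples):
        samples.append({
            "legit": legit[i::n_samples][:per_class],
            "dga": dga[i::n_samples][:per_class],
        })
    return samples


if __name__ == "__main__":
    # Synthetic placeholder domains, not data from the paper.
    legit_pool = [f"legit{i}.example" for i in range(2000)]
    dga_pool = [f"dga{i}.example" for i in range(2000)]
    samples = systematic_disjoint_samples(legit_pool, dga_pool)
    print(len(samples), len(samples[0]["legit"]), len(samples[0]["dga"]))
```

Evaluating on many small disjoint samples, instead of a single test split, gives a spread of scores per sample from which variability can be estimated.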