Adapting Large Language Models for Parameter-Efficient Log Anomaly Detection

Ying Fu Lim, Jiawen Zhu, Guansong Pang

TL;DR

This work tackles log anomaly detection (LAD) by using parameter-efficient fine-tuning (PEFT) to adapt large language models (LLMs) without the cost of full fine-tuning. It conducts a systematic, cross-model comparison of two PEFT techniques, LoRA and ReFT, applied to RoBERTa, GPT-2, and Llama-3 across four public LAD datasets, evaluating efficacy, stability, sample efficiency, robustness, and cross-dataset generalization. The findings show that ReFT generally delivers stronger performance and generalization than LoRA, albeit with longer training time, and that larger backbones such as Llama-3 amplify the gains. Together these results offer practical guidance for deploying LLM-driven LAD under varying resource constraints. The code is publicly available for reproducibility.

Abstract

Log Anomaly Detection (LAD) seeks to identify atypical patterns in log data that are crucial to assessing the security and condition of systems. Although Large Language Models (LLMs) have shown tremendous success in various fields, their use for detecting log anomalies is largely unexplored. This work aims to fill this gap. Because fully fine-tuning LLMs is prohibitively costly, we explore the use of parameter-efficient fine-tuning techniques (PEFTs) for adapting LLMs to LAD. To explore the potential of LLM-driven LAD in depth, we present a comprehensive investigation of two of the most popular PEFTs, Low-Rank Adaptation (LoRA) and Representation Fine-tuning (ReFT), applied to three prominent LLMs of varying size (RoBERTa, GPT-2, and Llama-3) for parameter-efficient LAD. Comprehensive experiments on four public log datasets reveal important insights into effective LLM-driven LAD from several key perspectives, including the efficacy of these PEFT-based LLM-driven LAD methods, their stability, sample efficiency, robustness w.r.t. unstable logs, and cross-dataset generalization. Code is available at https://github.com/mala-lab/LogADReft.
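To make the "parameter-efficient" idea concrete, the following is a minimal sketch of a generic LoRA-style update on a single linear layer, written in plain Python. It is not taken from the paper's code: the helper names (`matvec`, `lora_forward`, `param_counts`), the scaling factor `alpha`, and the 768-by-768 layer size are assumptions chosen purely for illustration. The frozen weight `W` is left untouched, and only the two low-rank factors `A` and `B` would be trained.

```python
def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def lora_forward(W, A, B, x, alpha=1.0):
    """Generic LoRA forward pass: h = W x + (alpha / r) * B (A x).

    W is d_out x d_in and stays frozen; A is r x d_in and B is d_out x r.
    Only A and B would receive gradient updates during fine-tuning.
    """
    r = len(A)  # rank of the low-rank update
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + (alpha / r) * d for b, d in zip(base, delta)]


def param_counts(d_out, d_in, r):
    """Return (full fine-tuning params, LoRA params) for one linear layer."""
    return d_out * d_in, r * (d_in + d_out)


if __name__ == "__main__":
    # Hypothetical layer size and rank, for illustration only.
    full, lora = param_counts(768, 768, 8)
    print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.4f}")

    # With B initialised to zero, the adapted layer starts out
    # identical to the frozen base layer.
    W = [[1.0, 2.0], [3.0, 4.0]]
    A = [[0.5, -0.5]]    # r = 1
    B = [[0.0], [0.0]]   # zero initialisation
    x = [1.0, 1.0]
    print(lora_forward(W, A, B, x))
```

Under these assumptions, the trainable parameter count scales with `r * (d_in + d_out)` rather than `d_in * d_out`, which is why the rank is a natural axis along which to compare training time and F1 score. ReFT differs in that it intervenes on hidden representations rather than on weights, and it is not sketched here.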

Paper Structure

This paper contains 20 sections, 5 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Pipeline of the PEFT-based LLM-driven LAD approaches.
  • Figure 2: Training time per epoch against rank for various fine-tuned LLMs on the BGL dataset.
  • Figure 3: F1 score against rank for various fine-tuned LLMs on various datasets.
  • Figure 4: F1 score against ratio of the dataset for Llama-3 on various datasets.
  • Figure 5: F1 change (%) against injection percentage (%) for Llama-3 on the HDFS dataset.