LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection

Hongcheng Guo, Jian Yang, Jiaheng Liu, Jiaqi Bai, Boyang Wang, Zhoujun Li, Tieqiao Zheng, Bo Zhang, Junran Peng, Qi Tian

TL;DR

LogFormer tackles cross-domain log anomaly detection by learning shared semantics from a source domain and transferring them to target domains via adapters. It introduces a Log-Attention module that preserves information lost in log parsing and uses lightweight adapters to keep the number of trainable parameters low. Training has two stages: supervised pre-training on the source domain, followed by adapter-based tuning on the target domain. The method achieves state-of-the-art $F_1$ scores on multiple benchmarks, including the GAIA dataset, at reduced training cost. The work demonstrates practical scalability for IT operations, where log data evolve across domains.

Abstract

Log anomaly detection is a key component in the field of artificial intelligence for IT operations (AIOps). Because log data come from various domains, retraining the whole network for each unknown domain is inefficient in real industrial scenarios. However, previous deep models merely focused on extracting the semantics of log sequences in the same domain, leading to poor generalization on multi-domain logs. To alleviate this issue, we propose a unified Transformer-based framework for log anomaly detection (LogFormer) to improve the generalization ability across different domains, where we establish a two-stage process consisting of a pre-training stage and an adapter-based tuning stage. Specifically, our model is first pre-trained on the source domain to obtain shared semantic knowledge of log data. Then, we transfer such knowledge to the target domain via shared parameters. In addition, the Log-Attention module is proposed to supplement the information ignored by log parsing. The proposed method is evaluated on three public datasets and one real-world dataset. Experimental results on multiple benchmarks demonstrate the effectiveness of LogFormer with fewer trainable parameters and lower training costs.
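
To make concrete what "information ignored by log parsing" can mean, the sketch below splits a raw log message into a template and the variable parameters that the template discards. It is only an illustration: the templates, the `<*>` wildcard convention, and the helper names `template_to_regex` and `split_log` are assumptions made here, not the authors' parser. Reading the discarded information as the variable parameters is likewise an inference from the Figure 4 caption, which mentions a parameter encoding.

```python
import re

# Hypothetical templates in the style a log parser produces;
# "<*>" marks a variable position. Illustrative only, not taken from the paper.
TEMPLATES = [
    "Connection from <*> closed after <*> seconds",
    "Program <*> is not running",
]


def template_to_regex(template):
    """Turn a template into a regex that captures each "<*>" slot."""
    parts = [re.escape(part) for part in template.split("<*>")]
    return re.compile("^" + r"(\S+)".join(parts) + "$")


def split_log(message, templates=TEMPLATES):
    """Return (template, parameters) for the first matching template, else (None, [])."""
    for template in templates:
        match = template_to_regex(template).match(message)
        if match:
            return template, list(match.groups())
    return None, []


template, params = split_log("Connection from 10.0.0.7 closed after 42 seconds")
print(template)  # the matched template (semantic part kept by parsing)
print(params)    # the variable values that the template alone drops
```

A model that consumes only `template` never sees the values in `params`; an encoder that also embeds them is one way to restore that signal.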

Paper Structure (30 sections, 8 equations, 9 figures, 8 tables)

Figures (9)

  • Figure 1: The same anomaly from multiple domains. The top part shows the "Unusual End of Program" anomaly from three domains (BGL, Thunderbird, and Red Storm), while the bottom part shows the "Program Not Running" anomaly from four domains (BGL, Thunderbird, Spirit, and Liberty).
  • Figure 2: Logs and Templates. The top part shows unstructured logs. We adopt the Drain algorithm to extract log templates and then match each log with its template, as shown in the middle part. The bottom part shows the structured inputs.
  • Figure 3: Overview of architecture. Log sequences are first fed into the pre-trained language model to extract features. The Log-Attention encoder is trained on the source domain to acquire shared semantic information. Then, we initialize the encoder and only tune the parameters of the adapter on the target domain to transfer the knowledge.
  • Figure 4: Log-Attention. The left part is the multi-head attention, and the right part is the parameter encoding.
  • Figure 5: Encoder with Adapters, where $N$ is the number of encoder layers. The left part shows the Log-Attention encoder with parallel adapters inserted, and the right part shows the structure of an adapter, which is composed of down- and up-projection layers.
  • ...and 4 more figures
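
The adapter described in the Figure 5 caption (a down-projection followed by an up-projection, inserted in parallel with the encoder) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the ReLU nonlinearity, the zero-initialised up-projection, the residual connection, the toy dimensions, and the `frozen_sublayer` stand-in are all choices made here for the example.

```python
import random


def matvec(weights, vector):
    """Multiply a matrix, stored as a list of rows, by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def relu(vector):
    return [max(0.0, x) for x in vector]


class Adapter:
    """Bottleneck adapter: down-projection, nonlinearity, up-projection."""

    def __init__(self, hidden_dim, bottleneck_dim, seed=0):
        rng = random.Random(seed)
        # Down-projection: hidden_dim -> bottleneck_dim (small random init).
        self.down = [[rng.gauss(0.0, 0.02) for _ in range(hidden_dim)]
                     for _ in range(bottleneck_dim)]
        # Up-projection: bottleneck_dim -> hidden_dim. Zero init is an
        # assumption here; it makes the adapter a no-op before tuning.
        self.up = [[0.0] * bottleneck_dim for _ in range(hidden_dim)]

    def __call__(self, hidden):
        return matvec(self.up, relu(matvec(self.down, hidden)))

    def num_parameters(self):
        return sum(len(r) for r in self.down) + sum(len(r) for r in self.up)


def frozen_sublayer(hidden):
    """Hypothetical stand-in for a frozen pre-trained sublayer (fixed scaling)."""
    return [0.5 * x for x in hidden]


def parallel_adapter_block(hidden, adapter):
    """Sum the residual input, the frozen sublayer output, and the adapter output."""
    main = frozen_sublayer(hidden)
    side = adapter(hidden)  # computed from the same input, i.e. in parallel
    return [h + m + s for h, m, s in zip(hidden, main, side)]


hidden_dim, bottleneck_dim = 8, 2
adapter = Adapter(hidden_dim, bottleneck_dim)
x = [1.0] * hidden_dim
y = parallel_adapter_block(x, adapter)
print(adapter.num_parameters(), "adapter weights vs", hidden_dim * hidden_dim,
      "for one full hidden-to-hidden matrix")
```

In this toy setting only the weights in `adapter.down` and `adapter.up` would be updated on the target domain, which is why the trainable parameter count stays small when the bottleneck is much narrower than the hidden size.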