LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection
Hongcheng Guo, Jian Yang, Jiaheng Liu, Jiaqi Bai, Boyang Wang, Zhoujun Li, Tieqiao Zheng, Bo Zhang, Junran Peng, Qi Tian
TL;DR
LogFormer tackles cross-domain log anomaly detection by learning shared semantics from a source domain and transferring them to target domains via adapters. It introduces a Log-Attention module that recovers information discarded by log parsing, and it uses lightweight adapters to keep the number of trainable parameters low. Training has two stages: supervised pre-training on the source domain, followed by adapter-based tuning on the target domain. The method achieves state-of-the-art $F_1$ scores on multiple benchmarks, including the GAIA dataset, at reduced training cost. The work demonstrates practical scalability for IT operations, where log data vary across domains.
Abstract
Log anomaly detection is a key component in the field of artificial intelligence for IT operations (AIOps). Because log data come from various domains, retraining the whole network for each unknown domain is inefficient in real industrial scenarios. However, previous deep models focused only on extracting the semantics of log sequences within a single domain, leading to poor generalization on multi-domain logs. To alleviate this issue, we propose a unified Transformer-based framework for Log anomaly detection (LogFormer) that improves generalization across different domains through a two-stage process: pre-training followed by adapter-based tuning. Specifically, our model is first pre-trained on the source domain to obtain shared semantic knowledge of log data. Then, we transfer this knowledge to the target domain via shared parameters. In addition, we propose the Log-Attention module to supplement the information ignored by log parsing. The proposed method is evaluated on three public datasets and one real-world dataset. Experimental results on multiple benchmarks demonstrate the effectiveness of LogFormer, which requires fewer trainable parameters and lower training costs.
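To give a rough sense of why adapter-based tuning needs fewer trainable parameters, the following sketch counts the parameters of one standard Transformer encoder layer and of a bottleneck adapter (a down-projection followed by an up-projection). The dimensions (`d_model=768`, `d_ff=3072`, `bottleneck=64`) and the choice of two adapters per layer are illustrative assumptions, not values taken from this abstract, and the function names are hypothetical.

```python
def transformer_layer_params(d_model: int, d_ff: int) -> int:
    """Parameter count of one standard Transformer encoder layer (with biases)."""
    attention = 4 * (d_model * d_model + d_model)   # Q, K, V and output projections
    feed_forward = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    layer_norms = 2 * (2 * d_model)                 # two LayerNorms, scale and shift each
    return attention + feed_forward + layer_norms


def adapter_params(d_model: int, bottleneck: int) -> int:
    """Parameter count of one bottleneck adapter: down-projection, then up-projection."""
    down = d_model * bottleneck + bottleneck
    up = bottleneck * d_model + d_model
    return down + up


def trainable_fraction(d_model: int, d_ff: int, bottleneck: int,
                       adapters_per_layer: int = 2) -> float:
    """Share of a layer's parameters that is trained when the backbone is frozen."""
    frozen = transformer_layer_params(d_model, d_ff)
    trainable = adapters_per_layer * adapter_params(d_model, bottleneck)
    return trainable / (frozen + trainable)


if __name__ == "__main__":
    # Assumed, illustrative sizes; the abstract does not state the model dimensions.
    d_model, d_ff, bottleneck = 768, 3072, 64
    print("layer parameters:  ", transformer_layer_params(d_model, d_ff))
    print("adapter parameters:", adapter_params(d_model, bottleneck))
    print("trainable fraction:", round(trainable_fraction(d_model, d_ff, bottleneck), 4))
```

Under these assumed sizes, only a few percent of each layer's parameters would be updated during tuning, while the pre-trained weights stay fixed and are shared across target domains.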
