FastLogAD: Log Anomaly Detection with Mask-Guided Pseudo Anomaly Generation and Discrimination
Yifei Lin, Hanqiu Deng, Xingyu Li
TL;DR
FastLogAD tackles fast, unsupervised log anomaly detection by combining a generator that produces pseudo anomalies with a discriminative one-class detector. It employs Mask-Guided Anomaly Generation (MGAG) to create realistic perturbations of normal log sequences and trains a discriminator with RTD and HST to tightly cluster normal embeddings while pushing anomalies outward, using the norm of the CLS embedding as the anomaly score. The threshold ε is set from normal validation data as ε = quantile_{0.99}(||φ_{θ_D}(s)_{cls}||_2), enabling detection without access to real anomalies or test data when choosing the threshold. On the HDFS, BGL, and Thunderbird datasets, FastLogAD achieves competitive or superior F1 scores and detects anomalies at least 10x faster than prior methods, highlighting its practical potential for real-time, domain-specific log analysis.
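The scoring rule above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names and the linear-interpolation quantile are assumptions, and plain lists of floats stand in for the discriminator's CLS embeddings φ_{θ_D}(s)_{cls}.

```python
import math


def l2_norm(vec):
    """Euclidean norm of one embedding vector (the anomaly score)."""
    return math.sqrt(sum(x * x for x in vec))


def quantile(values, q):
    """Empirical q-quantile with linear interpolation between order statistics."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def fit_threshold(normal_val_embeddings, q=0.99):
    """Set epsilon from the norms of normal validation embeddings only."""
    return quantile([l2_norm(e) for e in normal_val_embeddings], q)


def is_anomalous(embedding, eps):
    """Flag a sequence whose embedding norm exceeds the threshold."""
    return l2_norm(embedding) > eps
```

Because ε depends only on normal validation sequences, no labeled anomalies are involved in this step; at inference each sequence costs one forward pass plus a norm and a comparison.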
Abstract
Modern large-scale computer systems output logs extensively to record their runtime status, and it has become crucial to identify suspicious or malicious activity from these real-time logs. Because manual inspection is infeasible at this scale, fast automated log anomaly detection is necessary. Most existing unsupervised methods are trained only on normal log data, but they usually require either additional abnormal data for hyperparameter selection or auxiliary datasets for discriminative model optimization. In this paper, aiming for a highly effective discriminative model that enables rapid anomaly detection, we propose FastLogAD, a generator-discriminator framework that generates pseudo-abnormal logs with a Mask-Guided Anomaly Generation (MGAG) model and efficiently identifies anomalous logs with a Discriminative Abnormality Separation (DAS) model. Specifically, pseudo-abnormal logs are generated by replacing randomly masked tokens in a normal sequence with unlikely candidates. During the discriminative stage, FastLogAD learns a distinct separation between normal and pseudo-abnormal samples based on their embedding norms, which allows a threshold to be selected without exposure to any test data while achieving competitive performance. Extensive experiments on several common benchmarks show that FastLogAD outperforms existing anomaly detection approaches. Furthermore, FastLogAD detects anomalies at least 10x faster than prior methods. Our implementation is available at https://github.com/YifeiLin0226/FastLogAD.
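The generation step described in the abstract, replacing randomly masked tokens with unlikely candidates, can be sketched as follows. This is a hedged toy version, not the MGAG model itself: `make_pseudo_anomaly`, `toy_token_probs`, `mask_ratio`, and `bottom_k` are hypothetical names, and a fixed probability table stands in for a trained masked-language-model generator.

```python
import random


def toy_token_probs(masked_sequence, position):
    """Hypothetical generator: returns a fixed distribution over log keys.

    A real generator would condition on the masked sequence and position.
    """
    return {"E1": 0.5, "E2": 0.3, "E3": 0.15, "E4": 0.04, "E5": 0.01}


def make_pseudo_anomaly(sequence, token_probs, mask_ratio=0.3, bottom_k=2, rng=None):
    """Replace randomly chosen positions with low-probability tokens.

    Assumes the vocabulary returned by token_probs has at least two tokens.
    Returns the corrupted sequence and the sorted replaced positions.
    """
    if not sequence:
        raise ValueError("sequence must be non-empty")
    rng = rng or random.Random()
    n_mask = max(1, round(mask_ratio * len(sequence)))
    positions = rng.sample(range(len(sequence)), n_mask)

    # Mask the chosen positions before querying the generator.
    masked = list(sequence)
    for p in positions:
        masked[p] = "[MASK]"

    corrupted = list(sequence)
    for p in positions:
        probs = token_probs(masked, p)
        # Rank candidates from least to most likely, skipping the original
        # token so that the position is guaranteed to change.
        candidates = sorted(
            (tok for tok in probs if tok != sequence[p]),
            key=lambda tok: probs[tok],
        )
        corrupted[p] = rng.choice(candidates[:bottom_k])
    return corrupted, sorted(positions)
```

For example, `make_pseudo_anomaly(["E1", "E2", "E1", "E3", "E2", "E1"], toy_token_probs, mask_ratio=0.5)` replaces three positions with `E4` or `E5`, the two least likely keys in the toy table, and leaves the other positions unchanged.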
